Abstract. For any fixed alphabet $A$, the maximum topological entropy of a $\mathbb{Z}^d$ subshift with alphabet $A$ is obviously $\log |A|$. We study the class of nearest neighbor $\mathbb{Z}^d$ shifts of finite type which have topological entropy very close to this maximum, and show that they have many useful properties. Specifically, we prove that for any $d$, there exists $\beta_d$ such that for any nearest neighbor $\mathbb{Z}^d$ shift of finite type $X$ with alphabet $A$ for which $(\log |A|) - h(X) < \beta_d$, $X$ has a unique measure of maximal entropy $\mu$. Our values of $\beta_d$ decay polynomially (like $O(d^{-17})$), and we prove that the sequence must decay at least polynomially (like $d^{-0.25+o(1)}$). We also show some other desirable properties for such $X$, for instance that the topological entropy of $X$ is computable and that $\mu$ is isomorphic to a Bernoulli measure. Though there are other sufficient conditions in the literature (see [9], [14], [21]) which guarantee a unique measure of maximal entropy for $\mathbb{Z}^d$ shifts of finite type, this is (to our knowledge) the first such condition which makes no reference to the specific adjacency rules of individual letters of the alphabet.



1. Introduction 

A dynamical system consists of a space $X$ endowed with some sort of structure, along with a $G$-action $(T_g)$ on the space for some group $G$ which preserves that structure. (For our purposes, $G$ will always be $\mathbb{Z}^d$ for some $d$.) Two examples are measurable dynamics, where $X$ is a probability space and $(T_g)$ is a measurable family of measure-preserving maps, and topological dynamics, where $X$ is a compact space and the $T_g$ are a continuous family of homeomorphisms. In each setup, when $G$ is an amenable group (a class of groups which includes $G = \mathbb{Z}^d$), there is an invaluable notion of entropy: measure-theoretic, or metric, entropy in the setup of measurable dynamics, and topological entropy in the setup of topological dynamics. (We postpone rigorous definitions of these and other terms until Section 2.) These two notions are related by the famous Variational Principle, which says that the topological entropy of a topological dynamical system is the supremum of the measure-theoretic entropy over all $(T_g)$-invariant Borel probability measures supported in it. If in addition the system is taken to be expansive, then this supremum is achieved for at least one measure, and any such measures are called measures of maximal entropy. A question of particular interest is when a system supports a unique measure of maximal entropy. This is closely related to the concept of a phase transition in statistical physics, which occurs when a system supports multiple Gibbs measures. (See [15] for a discussion of this relationship.)

One particular class of topological dynamical systems for which measures of maximal entropy are well understood is the class of one-dimensional shifts of finite type, or SFTs. A one-dimensional shift of finite type is defined by a finite set $A$, called the alphabet, and a finite set $\mathcal{F}$ of forbidden words, or finite strings of letters from $A$. The shift of finite type $X$ induced by $\mathcal{F}$ then consists of all $x \in A^{\mathbb{Z}}$ (biinfinite strings of letters from $A$) which do not contain any of the forbidden words from $\mathcal{F}$. The space $A^{\mathbb{Z}}$ is endowed with the (discrete) product topology, and $X$ inherits the induced topology, under which it is a compact metrizable space. The dynamics of a shift of finite type are always given by the $\mathbb{Z}$-action of integer shifts on sequences in $X$. Any one-dimensional SFT which satisfies a mild mixing condition called irreducibility has a unique measure of maximal entropy called the Parry measure, which is just a Markov chain with transition probabilities which can be algorithmically computed. For more on one-dimensional shifts of finite type and their measures of maximal entropy, see [20].

2000 Mathematics Subject Classification. Primary: 37B50; Secondary: 37B10, 37A15.
Key words and phrases. $\mathbb{Z}^d$; shift of finite type; sofic; multidimensional.

RONNIE PAVLOV

Even for these relatively simple models, things become more complicated when one moves to multiple dimensions. A $d$-dimensional SFT is defined analogously to the one-dimensional case: specify the alphabet $A$ and a finite set $\mathcal{F}$ of forbidden ($d$-dimensional) finite configurations, and define a shift of finite type $X$ induced by $\mathcal{F}$ to be the set of all $x \in A^{\mathbb{Z}^d}$ (infinite $d$-dimensional arrays of letters from $A$) which do not contain any of the configurations from $\mathcal{F}$. The dynamics are now given by the $\mathbb{Z}^d$-action of all shifts by vectors in $\mathbb{Z}^d$. The easiest class of $d$-dimensional SFTs to work with are the nearest neighbor SFTs; a $d$-dimensional SFT $X$ is called nearest neighbor if $\mathcal{F}$ consists entirely of adjacent pairs of letters, meaning that a point's membership in $X$ is based solely on rules about which pairs of adjacent letters are legal in each cardinal direction. A useful illustrative example of a nearest neighbor SFT is the $d$-dimensional hard-core shift $\mathcal{H}_d$, defined by $A = \{0, 1\}$, and $\mathcal{F}$ consisting of all configurations made of adjacent pairs of 1s (in each of the $d$ cardinal directions). Then $\mathcal{H}_d$ consists of all ways of assigning 0 and 1 to each site in $\mathbb{Z}^d$ which do not contain two adjacent 1s.
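
As a small illustration of a nearest neighbor rule (a sketch only, not part of the arguments of this paper), the following Python code counts by brute force the 0/1 configurations on an $n \times n$ box that contain no two adjacent 1s, i.e. the hard-core rule for $d = 2$. The function names are hypothetical choices made for this example. Since 0 may legally sit next to any letter, each such configuration extends to a point of the hard-core shift by filling the rest of $\mathbb{Z}^2$ with 0s.

```python
from itertools import product


def is_hard_core(grid):
    """Return True if no two horizontally or vertically adjacent cells are both 1."""
    rows, cols = len(grid), len(grid[0])
    for i in range(rows):
        for j in range(cols):
            if grid[i][j] == 1:
                # Checking only the right and lower neighbors covers every adjacent pair once.
                if j + 1 < cols and grid[i][j + 1] == 1:
                    return False
                if i + 1 < rows and grid[i + 1][j] == 1:
                    return False
    return True


def count_hard_core(n):
    """Count 0/1 configurations on an n-by-n box with no two adjacent 1s (brute force)."""
    count = 0
    for cells in product((0, 1), repeat=n * n):
        grid = [cells[r * n:(r + 1) * n] for r in range(n)]
        if is_hard_core(grid):
            count += 1
    return count
```

The enumeration visits all $2^{n^2}$ candidate configurations, so it is only practical for very small $n$; it is meant to make the adjacency rule concrete rather than to be efficient.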

It turns out that many questions regarding $d$-dimensional SFTs are extremely difficult or intractable. For instance, given only the alphabet $A$ and forbidden list $\mathcal{F}$, the question of whether or not $X$ is even nonempty is algorithmically undecidable ([4], [27])! The structure of the set of measures of maximal entropy for multidimensional SFTs is similarly murky; it has been shown, for instance, that not even the strongest topological mixing properties, which are often enough to preclude some of the difficulties found in $d$-dimensional SFTs, imply uniqueness of the measure of maximal entropy ([5]). Even when the measure of maximal entropy is unique, its structure is not necessarily as simple as in the one-dimensional case: it may be a Bernoulli measure (for instance in the case when the SFT is all of $A^{\mathbb{Z}^d}$), but there also exist examples where it is not even measure-theoretically weak mixing.

There are existing conditions in the literature which guarantee uniqueness of the measure of maximal entropy, but many of these require quite strong restrictions on the adjacency rules defining $X$. For instance, it was first shown in [21] (using the Dobrushin uniqueness criterion) that if the alphabet of a nearest neighbor $d$-dimensional SFT $X$ has a large enough proportion of letters called safe symbols, meaning that they may legally sit next to any letter of the alphabet in any direction, then $X$ has a unique measure of maximal entropy. It was later shown in [14] that if all letters of the alphabet of a nearest neighbor $d$-dimensional SFT $X$ are only "nearly safe," meaning that they can legally sit next to a large enough proportion of the letters in $A$ in any direction, then again $X$ has a unique measure of maximal entropy. Both of these conditions, though useful, have two problems. Firstly, they make reference to combinatorial information about the adjacency rules themselves, rather than more coarse topological information about the system itself. Secondly, they are not very robust conditions; if one takes an SFT satisfying one of these conditions, and then adds a single letter to $A$ with new adjacency rules which do not allow it to sit next to a large portion of $A$, then the conditions are no longer satisfied.

The main focus of this paper is to define a more robust, less combinatorial, condition on multidimensional SFTs which guarantees existence of a unique measure of maximal entropy. Our condition is similar in spirit to the previously mentioned one from [14], but rather than requiring every single letter of the alphabet to be "nearly safe," i.e. allowed to sit next to a large proportion of the letters in $A$ in any direction, we require only that a large proportion of the letters of the alphabet are "nearly safe" in this sense. More specifically, call a nearest neighbor $d$-dimensional SFT $\epsilon$-full if there exists a subset of the alphabet of size at least $(1 - \epsilon)|A|$ consisting of letters which each have at least $(1 - \epsilon)|A|$ legal neighbors in each cardinal direction. Our main result is that for small enough $\epsilon$ (dependent on $d$), every $\epsilon$-full nearest neighbor $d$-dimensional SFT $X$ has a unique measure of maximal entropy $\mu$. We also prove several other desirable properties for such $X$, such as showing that the topological entropy of $X$ is a computable number and that $\mu$ is measure-theoretically isomorphic to a Bernoulli measure.

Somewhat surprisingly, it is easily shown that the $\epsilon$-fullness condition is implied by a condition which makes no mention of adjacency rules whatsoever, namely having topological entropy very close to the log of the alphabet size. Specifically, for any $\epsilon$, there exists $\beta$ for which any nearest neighbor $d$-dimensional SFT with entropy at least $(\log |A|) - \beta$ is $\epsilon$-full. This shows that all of the properties we prove for $\epsilon$-full nearest neighbor $d$-dimensional SFTs are shared by nearest neighbor $d$-dimensional SFTs with entropy close enough to the log of the alphabet size.

We now briefly summarize the layout of the rest of the paper. In Section 2, we give definitions and basic preliminary results required for our arguments. In Section 3, we state and prove our main result. In Section 4, we show that $\epsilon$-fullness is unrelated to any existing topological mixing conditions in the literature, i.e. it does not imply and is not implied by any of these conditions. Section 5 compares our condition with some other sufficient conditions for uniqueness of measure of maximal entropy from the literature, and in Section 6 we discuss the maximal values of $\beta_d$ which still guarantee uniqueness of measure of maximal entropy for all nearest neighbor $d$-dimensional SFTs with entropy at least $(\log |A|) - \beta_d$, in particular proving polynomially decaying upper and lower bounds on these values.

2. Definitions and preliminaries

We begin with some geometric definitions for $\mathbb{Z}^d$. Throughout, $(e_i)$ represents the standard orthonormal basis of $\mathbb{Z}^d$. We use $d$ to denote the $\ell_1$ metric on points in $\mathbb{Z}^d$: $d(s,t) := \|s - t\|_1 = \sum_i |s_i - t_i|$. For any sets $S, T \subseteq \mathbb{Z}^d$, we define $d(S,T) := \min_{s \in S, t \in T} d(s,t)$. We say that two sites $s, t \in \mathbb{Z}^d$ are adjacent if $d(s,t) = 1$. We also refer to adjacent sites as neighbors, and correspondingly define the neighbor set $N_t$ of any $t \in \mathbb{Z}^d$ as the set of sites in $\mathbb{Z}^d$ adjacent to $t$.

This notion of adjacency gives $\mathbb{Z}^d$ a graph structure, and the notions of paths and connected subsets of $\mathbb{Z}^d$ are defined with this graph structure in mind. The outer boundary of a set $S \subseteq \mathbb{Z}^d$, written $\partial S$, is the set of all $t \in \mathbb{Z}^d \setminus S$ adjacent to some $s \in S$. The inner boundary of $S$, written $\underline{\partial} S$, is the set of all $s \in S$ adjacent to some $t \in \mathbb{Z}^d \setminus S$. A closed contour surrounding $S$ is any set of the form $\partial T$ for a connected set $T \subseteq \mathbb{Z}^d$ containing $S$.

Definition 2.1. For any finite alphabet $A$, the $\mathbb{Z}^d$ full shift over $A$ is the set $A^{\mathbb{Z}^d}$, which is viewed as a compact topological space with the (discrete) product topology.

Definition 2.2. A configuration over $A$ is a member of $A^S$ for some finite $S \subseteq \mathbb{Z}^d$, which is said to have shape $S$. The set $\bigcup_{S \subseteq \mathbb{Z}^d, |S| < \infty} A^S$ of all configurations over $A$ is denoted by $A^*$. When $d = 1$, a configuration whose shape is an interval of integers is sometimes referred to as a word.

Definition 2.3. For two configurations $v \in A^S$ and $w \in A^T$ with $S \cap T = \emptyset$, the concatenation of $v$ and $w$, written $vw$, is the configuration on $S \cup T$ defined by $(vw)|_S = v$ and $(vw)|_T = w$.

Definition 2.4. The $\mathbb{Z}^d$-shift action, denoted by $\{\sigma_t\}_{t \in \mathbb{Z}^d}$, is the $\mathbb{Z}^d$-action on a full shift $A^{\mathbb{Z}^d}$ defined by $(\sigma_t x)(s) = x(s + t)$ for $s, t \in \mathbb{Z}^d$.

Definition 2.5. A $\mathbb{Z}^d$ subshift is a closed subset of a full shift $A^{\mathbb{Z}^d}$ which is invariant under the shift action.

Each $\sigma_t$ is a homeomorphism on any $\mathbb{Z}^d$ subshift, and so any $\mathbb{Z}^d$ subshift, when paired with the $\mathbb{Z}^d$-shift action, is a topological dynamical system. An alternate definition for a $\mathbb{Z}^d$ subshift is in terms of disallowed configurations; for any set $\mathcal{F} \subseteq A^*$, one can define the set $X(\mathcal{F}) := \{x \in A^{\mathbb{Z}^d} : x|_S \notin \mathcal{F} \ \forall \text{ finite } S \subseteq \mathbb{Z}^d\}$. It is well known that any $X(\mathcal{F})$ is a $\mathbb{Z}^d$ subshift, and all $\mathbb{Z}^d$ subshifts are representable in this way. All $\mathbb{Z}^d$ subshifts are assumed to be nonempty in this paper.

Definition 2.6. A $\mathbb{Z}^d$ shift of finite type (SFT) is a $\mathbb{Z}^d$ subshift equal to $X(\mathcal{F})$ for some finite $\mathcal{F}$. If $\mathcal{F}$ is made up of pairs of adjacent letters, i.e. if $\mathcal{F} \subseteq \bigcup_{i=1}^d A^{\{0, e_i\}}$, then $X$ is called a nearest neighbor $\mathbb{Z}^d$ SFT.

Definition 2.7. The language of a $\mathbb{Z}^d$ subshift $X$, denoted by $L(X)$, is the set of all configurations which appear in points of $X$. For any finite $S \subseteq \mathbb{Z}^d$, $L_S(X) := L(X) \cap A^S$, the set of configurations in the language of $X$ with shape $S$.

Configurations in $L(X)$ are said to be globally admissible.

Definition 2.8. A configuration $u \in A^S$ is locally admissible for a $\mathbb{Z}^d$ SFT $X = X(\mathcal{F})$ if $u|_T \notin \mathcal{F}$ for all $T \subseteq S$. In other words, $u$ is locally admissible if it does not contain any of the forbidden configurations for $X$. We denote by $LA(X)$ the set of all locally admissible configurations for $X$, and by $LA_S(X)$ the set $LA(X) \cap A^S$ for any finite $S \subseteq \mathbb{Z}^d$.

We note that any globally admissible configuration is obviously locally admissible, but the converse is not necessarily true. (In general, a configuration could be locally admissible, but attempting to complete it to all of $\mathbb{Z}^d$ always leads to a forbidden configuration.)

Definition 2.9. For any $\mathbb{Z}^d$ subshift $X$ and configuration $w \in L_S(X)$, the cylinder set $[w]$ is the set of all $x \in X$ with $x|_S = w$. We define the configuration set $\langle w \rangle$ to be the set of all configurations $u$ in $L(X)$ with shape containing $S$ for which $u|_S = w$. For any set $C$ of configurations, we use the shorthand notations $[C]$ and $\langle C \rangle$ to refer to $\bigcup_{w \in C} [w]$ and $\bigcup_{w \in C} \langle w \rangle$ respectively.

In the following definition and hereafter, for any integers $m < n$, $[m,n]$ denotes the set of integers $\{m, m + 1, \ldots, n\}$.

Definition 2.10. The topological entropy of a $\mathbb{Z}^d$ subshift $X$ is

\[ h(X) := \lim_{n_1, \ldots, n_d \to \infty} \frac{1}{\prod_{i=1}^d n_i} \log \left| L_{\prod_{i=1}^d [1, n_i]}(X) \right|. \]

We will also need several measure-theoretic definitions. 

Definition 2.11. For any measures $\mu$, $\nu$ on the same finite probability space $X$, the total variational distance between $\mu$ and $\nu$ is

\[ d(\mu, \nu) := \frac{1}{2} \sum_{x \in X} |\mu(\{x\}) - \nu(\{x\})| = \max_{A \subseteq X} |\mu(A) - \nu(A)|. \]

Definition 2.12. For any measures $\mu$, $\nu$ on probability spaces $X$ and $Y$ respectively, a coupling of $\mu$ and $\nu$ is a measure $\lambda$ on $X \times Y$ whose marginals are $\mu$ and $\nu$; i.e. $\lambda(A \times Y) = \mu(A)$ for all measurable $A \subseteq X$ and $\lambda(X \times B) = \nu(B)$ for all measurable $B \subseteq Y$. If $X = Y$, then an optimal coupling of $\mu$ and $\nu$ is a coupling $\lambda$ which minimizes the probability $\lambda(\{(x, y) : x \neq y\})$ of disagreement.

The connection between Definitions 2.11 and 2.12 is the well-known fact that for any $\mu$ and $\nu$ on the same finite probability space, optimal couplings exist, and the probability of disagreement for an optimal coupling is equal to the total variational distance $d(\mu, \nu)$.
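
The following Python sketch illustrates this fact on a finite space; it is an illustration under our own assumptions, not a construction used later in the paper. Measures are represented as dictionaries from points to exact `Fraction` weights, and the names `total_variation`, `optimal_coupling`, and `disagreement` are hypothetical. The coupling puts as much mass as possible on the diagonal, namely $\min(\mu(\{x\}), \nu(\{x\}))$ at each $(x, x)$, and then pairs the leftover mass off the diagonal.

```python
from fractions import Fraction


def total_variation(mu, nu):
    """Half the l1 distance between two measures on the same finite space."""
    points = set(mu) | set(nu)
    return sum(abs(mu.get(x, 0) - nu.get(x, 0)) for x in points) / 2


def optimal_coupling(mu, nu):
    """Return a coupling {(x, y): mass} of mu and nu with maximal diagonal mass."""
    points = sorted(set(mu) | set(nu))
    coupling = {}
    surplus_mu, surplus_nu = [], []
    for x in points:
        m, n = mu.get(x, 0), nu.get(x, 0)
        shared = min(m, n)
        if shared > 0:
            coupling[(x, x)] = shared  # mass on which the two coordinates agree
        if m > shared:
            surplus_mu.append([x, m - shared])
        if n > shared:
            surplus_nu.append([x, n - shared])
    # Pair the leftover mass; a point never has surplus on both sides,
    # so every pair created here is off the diagonal.
    i = j = 0
    while i < len(surplus_mu) and j < len(surplus_nu):
        step = min(surplus_mu[i][1], surplus_nu[j][1])
        pair = (surplus_mu[i][0], surplus_nu[j][0])
        coupling[pair] = coupling.get(pair, 0) + step
        surplus_mu[i][1] -= step
        surplus_nu[j][1] -= step
        if surplus_mu[i][1] == 0:
            i += 1
        if surplus_nu[j][1] == 0:
            j += 1
    return coupling


def disagreement(coupling):
    """Probability that the two coordinates of the coupling differ."""
    return sum(p for (x, y), p in coupling.items() if x != y)
```

Exact rational weights are used so that the comparison with zero in the pairing loop is reliable; with floating-point weights the leftover masses may fail to cancel exactly.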

From now on, any measure $\mu$ on a full shift $A^{\mathbb{Z}^d}$ is assumed to be a Borel probability measure which is shift-invariant, i.e. $\mu(\sigma_t C) = \mu(C)$ for any measurable $C$ and $t \in \mathbb{Z}^d$.

Definition 2.13. For any measure $\mu$ on a full shift $A^{\mathbb{Z}^d}$, the measure-theoretic entropy of $\mu$ is

\[ h(\mu) := \lim_{n_1, \ldots, n_d \to \infty} \frac{-1}{\prod_{i=1}^d n_i} \sum_{w \in A^{\prod_{i=1}^d [1, n_i]}} \mu([w]) \log \mu([w]), \]

where terms with $\mu([w]) = 0$ are omitted from the sum.

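In Definitions 2.10 and 2.13, a subadditivity argument shows that the limits can be replaced by infimums; i.e. for any $n_1, \ldots, n_d$,

\[ h(X) \leq \frac{1}{\prod_{i=1}^d n_i} \log \left| L_{\prod_{i=1}^d [1, n_i]}(X) \right| \quad \text{and} \quad h(\mu) \leq \frac{-1}{\prod_{i=1}^d n_i} \sum_{w \in A^{\prod_{i=1}^d [1, n_i]}} \mu([w]) \log \mu([w]). \]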
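
To make the upper bound on $h(X)$ concrete, here is a minimal Python sketch for the one-dimensional hard-core shift ($d = 1$, alphabet $\{0, 1\}$, no two adjacent 1s), whose topological entropy is known to be the logarithm of the golden ratio. This example and the function names are our own illustration. In this shift every word without two adjacent 1s is globally admissible (extend it by 0s), so $|L_{[1,n]}(X)|$ can be counted by a simple recursion.

```python
import math


def count_hard_core_words(n):
    """Number of 0/1 words of length n (n >= 1) with no two adjacent 1s."""
    end0, end1 = 1, 1  # words of length 1 ending in 0 and ending in 1
    for _ in range(n - 1):
        # A 0 may follow either letter; a 1 may only follow a 0.
        end0, end1 = end0 + end1, end0
    return end0 + end1


def entropy_upper_bound(n):
    """The quantity (1/n) log |L_[1,n](X)|, an upper bound for h(X)."""
    return math.log(count_hard_core_words(n)) / n


# Known value of the entropy of this shift: log of the golden ratio.
GOLDEN_MEAN_ENTROPY = math.log((1 + math.sqrt(5)) / 2)
```

Evaluating `entropy_upper_bound(n)` for increasing `n` gives values that stay above `GOLDEN_MEAN_ENTROPY` and approach it, in line with the infimum characterization.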

Definition 2.14. For any $\mathbb{Z}^d$ subshift $X$, a measure of maximal entropy on $X$ is a measure $\mu$ with support contained in $X$ for which $h(\mu) = h(X)$.

The classical variational principle (see [Hj for a proof) says that for any $\mathbb{Z}^d$ subshift $X$, $\sup_\mu h(\mu) = h(X)$, where the supremum, taken over all shift-invariant Borel probability measures whose support is contained in $X$, is achieved. Therefore, any $\mathbb{Z}^d$ subshift has at least one measure of maximal entropy. In the specific case when $X$ is a nearest neighbor $\mathbb{Z}^d$ SFT, much is known about the conditional distributions of a measure of maximal entropy.

Definition 2.15. A measure $\mu$ on $A^{\mathbb{Z}^d}$ is called a Markov random field (or MRF) if, for any finite $S \subseteq \mathbb{Z}^d$, any $w \in A^S$, any finite $T \subseteq \mathbb{Z}^d \setminus S$ s.t. $\partial S \subseteq T$, and any $\delta \in A^T$ with $\mu([\delta]) \neq 0$,

\[ \mu([w] \mid [\delta|_{\partial S}]) = \mu([w] \mid [\delta]). \]

Informally, $\mu$ is an MRF if, for any finite $S \subseteq \mathbb{Z}^d$, the sites in $S$ and the sites in $\mathbb{Z}^d \setminus (S \cup \partial S)$ are $\mu$-conditionally independent given the sites on $\partial S$. The following characterization of measures of maximal entropy of nearest neighbor $\mathbb{Z}^d$ SFTs is a corollary of the classical Lanford-Ruelle theorem, but the self-contained version proved in [5] is useful for our purposes.

Proposition 2.16. ([9], Proposition 1.20) For any nearest neighbor $\mathbb{Z}^d$ SFT $X$, all measures of maximal entropy for $X$ are MRFs, and for any such measure $\mu$ and any finite shape $S \subseteq \mathbb{Z}^d$, the conditional distribution of $\mu$ on $S$ given any $\delta \in L_{\partial S}(X)$ is uniform over all configurations $x \in L_S(X)$ for which $x\delta \in LA(X)$.

In other words, given any nearest neighbor $\mathbb{Z}^d$ SFT $X$, there is a unique set of conditional distributions that any measure of maximal entropy $\mu$ must match up with. However, this does not uniquely determine $\mu$, as there could be several different measures with the same conditional distributions. For any $\delta \in L_{\partial S}(X)$ as in Proposition 2.16, we denote by $\Lambda^\delta$ the common uniform conditional distribution on $S$ given $\delta$ that every measure of maximal entropy $\mu$ must have.

Next, we define some useful conditions for SFTs and measures supported on SFTs from the literature, many of which we will be able to prove for nearest neighbor $\mathbb{Z}^d$ SFTs which are $\epsilon$-full for small enough $\epsilon$.

Definition 2.17. A measure-theoretic factor map between two measures $\mu$ on $A^{\mathbb{Z}^d}$ and $\mu'$ on $B^{\mathbb{Z}^d}$ is a measurable function $F : A^{\mathbb{Z}^d} \to B^{\mathbb{Z}^d}$ which commutes with the shift action (i.e. $F(\sigma_t x) = \sigma_t F(x)$ for all $x \in A^{\mathbb{Z}^d}$) and for which $\mu'(C) = \mu(F^{-1} C)$ for all measurable $C \subseteq B^{\mathbb{Z}^d}$.

Definition 2.18. A measure-theoretic isomorphism is a measure-theoretic factor map which is bijective between sets of full measure in the domain and range.

Definition 2.19. A measure $\mu$ on $A^{\mathbb{Z}^d}$ is ergodic if any measurable set $C$ which is shift-invariant, meaning $\mu(C \triangle \sigma_t C) = 0$ for all $t \in \mathbb{Z}^d$, has measure 0 or 1. Equivalently, $\mu$ is ergodic iff for any configurations $u$, $u'$ over $A$,

\[ \lim_{n \to \infty} \frac{1}{(2n + 1)^d} \sum_{t \in [-n, n]^d} \mu([u] \cap \sigma_t [u']) = \mu([u]) \mu([u']). \]

Definition 2.20. A measure $\mu$ on $A^{\mathbb{Z}^d}$ is measure-theoretically strong mixing if for any configurations $u$, $u'$ over $A$ and any sequence $t_n \in \mathbb{Z}^d$ for which $\|t_n\|_\infty \to \infty$,

\[ \lim_{n \to \infty} \mu([u] \cap \sigma_{t_n} [u']) = \mu([u]) \mu([u']). \]

Definition 2.21. A measure $\mu$ on $A^{\mathbb{Z}^d}$ is Bernoulli if it is independent and identically distributed over the sites of $\mathbb{Z}^d$.

In dynamics, traditionally a measure is also called Bernoulli if it is measure-theoretically isomorphic to a Bernoulli measure. There is an entire hierarchy of measure-theoretic mixing conditions, all of which are useful isomorphism invariants of measures. (See, for instance, [26].) We will not spend much space here discussing this hierarchy, because Bernoullicity is the strongest of all of them, and we will verify that the unique measure of maximal entropy of $\epsilon$-full nearest neighbor $\mathbb{Z}^d$ SFTs is isomorphic to a Bernoulli measure for sufficiently small $\epsilon$.

Definition 2.22. A topological factor map between two $\mathbb{Z}^d$ subshifts $X$ and $X'$ is a surjective continuous function $F : X \to X'$ which commutes with the shift action (i.e. $F(\sigma_t x) = \sigma_t F(x)$ for all $x \in X$).

Definition 2.23. A topological conjugacy is a bijective topological factor map.

The next three definitions are examples of topological mixing conditions, which all involve exhibiting multiple globally admissible configurations in a single point, when separated by a large enough distance.

Definition 2.24. A $\mathbb{Z}^d$ SFT $X$ is topologically mixing if for any configurations $u, u' \in L(X)$, there exists $n$ so that $[u] \cap \sigma_t [u'] \neq \emptyset$ for any $t \in \mathbb{Z}^d$ with $\|t\|_\infty > n$.

Definition 2.25. A $\mathbb{Z}^d$ SFT $X$ is block gluing if there exists $n$ so that for any configurations $u, u' \in L(X)$ with shapes rectangular prisms and any $t \in \mathbb{Z}^d$ for which $u$ and $\sigma_t u'$ are separated by distance at least $n$, $[u] \cap \sigma_t [u'] \neq \emptyset$.

Definition 2.26. A $\mathbb{Z}^d$ SFT $X$ has the uniform filling property or UFP if there exists $n$ such that for any configuration $u \in L(X)$ with shape a rectangular prism $R = \prod [a_i, b_i]$, and any point $x \in X$, there exists $y \in X$ such that $y|_R = u$ and $y|_{\mathbb{Z}^d \setminus \prod [a_i - n, b_i + n]} = x|_{\mathbb{Z}^d \setminus \prod [a_i - n, b_i + n]}$.

All of these conditions are invariant under topological conjugacy. Note the subtle difference in the definitions: Definitions 2.25 and 2.26 require a uniform distance which suffices to mix between all pairs of configurations of a certain type, whereas Definition 2.24 allows this distance to depend on the configurations. In general, standard topological mixing is not a very strong condition for $\mathbb{Z}^d$ SFTs; usually a stronger condition involving a uniform mixing length such as block gluing or UFP is necessary to prove interesting results. (See [7] for a detailed description of a hierarchy of topological mixing conditions for $\mathbb{Z}^d$ SFTs.)

The final topological properties that we will show for $\epsilon$-full SFTs for small $\epsilon$ do not quite fit into the topological mixing hierarchy. The first involves modeling measure-theoretic dynamical systems within a subshift.

Definition 2.27. A $\mathbb{Z}^d$ subshift $X$ is a measure-theoretic universal model if for any $\mathbb{Z}^d$ ergodic measure-theoretic dynamical system $(Y, \mu, (T_t)_{t \in \mathbb{Z}^d})$, there exists a measure $\nu$ on $X$ so that $(X, \nu, (\sigma_t)_{t \in \mathbb{Z}^d}) \cong (Y, \mu, (T_t)_{t \in \mathbb{Z}^d})$.

It was shown in [24] that any $\mathbb{Z}^d$ SFT with the UFP is a measure-theoretic universal model.

We also need a definition from computability theory. 

Definition 2.28. A real number $\alpha$ is computable in time $f(n)$ if there exists a Turing machine which, on input $n$, outputs a pair $(p_n, q_n)$ of integers such that $\left| \frac{p_n}{q_n} - \alpha \right| < \frac{1}{n}$, and this procedure takes fewer than $f(n)$ operations for every $n$. We say that $\alpha$ is computable if it is computable in time $f(n)$ for some function $f(n)$.
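
As a toy illustration of this notion (our own example, unrelated to entropies of SFTs), the Python sketch below plays the role of the Turing machine for $\alpha = \sqrt{2}$: on input $n$ it returns integers $(p_n, q_n)$ with $|p_n / q_n - \sqrt{2}| < 1/n$, which is the accuracy we assume the definition asks for. The function name is hypothetical, and only exact integer arithmetic is used.

```python
from math import isqrt


def approximate_sqrt2(n):
    """Return integers (p, q) with |p/q - sqrt(2)| < 1/n, for n >= 1."""
    q = n
    # p is the largest integer with p^2 <= 2 n^2, so p/n <= sqrt(2) < (p + 1)/n.
    p = isqrt(2 * n * n)
    return p, q
```

Since $p_n / n \leq \sqrt{2} < (p_n + 1)/n$ and $\sqrt{2}$ is irrational, the error is strictly smaller than $1/n$.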

For an introduction to computability theory, see [19]. The relationship between multidimensional symbolic dynamics and computability theory has been the subject of much work in recent years, but is still not completely understood. One foundational result is from [16], where it is shown that a real number is the entropy of some $\mathbb{Z}^d$ SFT for any $d > 1$ if and only if it has a property called right recursive enumerability, which is strictly weaker than computability and which we do not define here. It is also shown in [16] that if a $\mathbb{Z}^d$ SFT has the uniform filling property, then its entropy is in fact computable.

We conclude this section by finally defining $\epsilon$-fullness of a nearest neighbor $\mathbb{Z}^d$ SFT and showing its connection to entropy.

Definition 2.29. For any $\epsilon > 0$, we say that a nearest neighbor $\mathbb{Z}^d$ SFT $X$ with alphabet $A$ is $\epsilon$-full if $A$ can be partitioned into sets $G$ (good letters) and $B$ (bad letters) with the properties that

(i) $|G| > (1 - \epsilon)|A|$

(ii) $\forall g \in G$, $i \in [1, d]$, $\tau \in \{\pm 1\}$, the set of legal neighbors of $g$ in the $\tau e_i$-direction has cardinality greater than $(1 - \epsilon)|A|$.
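
The two conditions can be checked mechanically from the adjacency rules. The Python sketch below is a hedged illustration under our own encoding, which is an assumption and not notation from this paper: `allowed[i]` is the set of pairs `(a, b)` such that `b` may legally sit at $t + e_i$ when `a` sits at $t$, and the function names are hypothetical. Because condition (ii) counts legal neighbors in all of $A$ and does not depend on the choice of $G$, it suffices to take $G$ to be the set of all letters satisfying (ii) and then test (i).

```python
def good_letters(alphabet, allowed, eps):
    """Letters with more than (1 - eps)|A| legal neighbors in every direction."""
    threshold = (1 - eps) * len(alphabet)
    good = set()
    for g in alphabet:
        ok = True
        for pairs in allowed:  # one set of legal ordered pairs per coordinate direction
            forward = sum(1 for b in alphabet if (g, b) in pairs)   # +e_i direction
            backward = sum(1 for a in alphabet if (a, g) in pairs)  # -e_i direction
            if forward <= threshold or backward <= threshold:
                ok = False
                break
        if ok:
            good.add(g)
    return good


def is_eps_full(alphabet, allowed, eps):
    """Test conditions (i) and (ii) with G taken as large as possible."""
    return len(good_letters(alphabet, allowed, eps)) > (1 - eps) * len(alphabet)
```

For instance, with alphabet $\{0, \ldots, 9\}$ and the rule that a 0 may only be adjacent to another 0, the letters $1, \ldots, 9$ each have 9 legal neighbors in every direction, so the check succeeds for $\epsilon = 0.2$ and fails for $\epsilon = 0.05$.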

We first show some useful technical properties for $\epsilon$-full nearest neighbor $\mathbb{Z}^d$ SFTs with small $\epsilon$.

Lemma 2.30. If $X$ is $\epsilon$-full for $\epsilon < \frac{1}{2d+2}$, then for any locally admissible configuration $w$ with shape $S$ with $w|_{\underline{\partial} S} \in G^{\underline{\partial} S}$ and any $t \in \mathbb{Z}^d \setminus S$, there exists a nonempty subset $G'$ of $G$ with cardinality greater than $|A|(1 - (2d + 1)\epsilon)$ so that for any $g' \in G'^{\{t\}}$, the concatenation $wg'$ is locally admissible.

Proof. Since $\epsilon < \frac{1}{2d+2}$, $|A|(1 - (2d + 1)\epsilon) > |A|\epsilon$. We note that if $|A|\epsilon < 1$, then $\epsilon$-fullness of $X$ implies that $G = A$ and $X$ is a full shift, in which case the lemma is trivial. So, we can assume that $|A|(1 - (2d + 1)\epsilon) > 1$. Define $N = N_t \cap S$, and note that $N \subseteq \underline{\partial} S$. For any $a \in A$, as long as $a$ can appear legally next to each of the at most $2d$ letters in $w|_N$, the concatenation $wa$ is locally admissible. Each letter in $w|_N$ is a $G$-letter, and so by $\epsilon$-fullness, for each $s \in N$, the set of letters which can appear legally at $t$ adjacent to $w(s)$ has cardinality at least $|A|(1 - \epsilon)$, and so there are at least $|A|(1 - 2d\epsilon)$ letters in $A$ for which $wa$ is locally admissible. Since $|G| > |A|(1 - \epsilon)$, at least $|A|(1 - (2d + 1)\epsilon)$ of these letters are in $G$, and we are done.

□ 

Corollary 2.31. If $X$ is $\epsilon$-full and $\epsilon < \frac{1}{2d+2}$, then any locally admissible configuration $w$ with shape $S$ with $w|_{\underline{\partial} S}$ consisting only of $G$-letters is also globally admissible. In particular, $w$ can be extended to a point of $X$ by appending only $G$-letters to $w$.

Proof. As before, we can reduce to the case where $|A|(1 - (2d + 1)\epsilon) > 1$. Suppose $w$ is a locally admissible configuration with shape $S$ s.t. $w|_{\underline{\partial} S}$ consists only of $G$-letters, and arbitrarily order the sites in $\mathbb{Z}^d \setminus S$ as $s_i$, $i \in \mathbb{N}$. We claim that for any $n$, there exists a locally admissible configuration $w_n$ with shape $S \cup \bigcup_{i=1}^n \{s_i\}$ such that $w_n|_S = w$, $w_n|_{\underline{\partial} S \cup \bigcup_{i=1}^n \{s_i\}}$ consists only of $G$-letters, and each $w_n$ is a subconfiguration of $w_{n+1}$.

The proof is by induction: the existence of $w_1$ is obvious by applying Lemma 2.30 to $w$ and $s_1$, and for any $n$, if we assume the existence of $w_n$, the existence of $w_{n+1}$ comes from applying Lemma 2.30 to $w_n$ and $s_{n+1}$, along with the observation that clearly $\underline{\partial}(S \cup \bigcup_{i=1}^n \{s_i\}) \subseteq (\underline{\partial} S) \cup \bigcup_{i=1}^n \{s_i\}$.

Then the $w_n$ approach a limit point $x \in A^{\mathbb{Z}^d}$, which is in $X$ since each $w_n$ was locally admissible. Since $w$ was a subconfiguration of each $w_n$, it is a subconfiguration of $x$, and so $w \in L(X)$.

□ 

Surprisingly, the $\epsilon$-fullness property is closely related to a simpler property which can be stated without any reference to adjacency rules, i.e. having entropy close to the log of the alphabet size.

Theorem 2.32. For any $\epsilon > 0$ and $d$, there exists a $\beta = \beta(\epsilon, d)$ so that for a nearest neighbor $\mathbb{Z}^d$ SFT $X$ with alphabet $A$,

\[ h(X) > (\log |A|) - \beta \Longrightarrow X \text{ is } \epsilon\text{-full}. \]

Also, for any $\beta > 0$ and $d$, there exists an $\epsilon = \epsilon(\beta, d)$ so that

\[ X \text{ is } \epsilon\text{-full} \Longrightarrow h(X) > (\log |A|) - \beta. \]

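Proof. Fix any $d$ and $\epsilon > 0$, and suppose that $X$ is not $\epsilon$-full. This implies that if we define $B$ to be the set of $b \in A$ for which there exists $i \in [1, d]$ and $\tau \in \{\pm 1\}$ so that there are at least $\epsilon|A|$ letters which cannot follow $b$ in the $\tau e_i$-direction, then $|B| > \epsilon|A|$. (Otherwise, taking $G$ to be $B^c$ would show that $X$ is $\epsilon$-full.)

Then, there exist $\tau \in \{\pm 1\}$, $i \in [1, d]$, and a set $B_i \subseteq B$ with $|B_i| > \frac{\epsilon}{2d}|A|$ so that for each $b \in B_i$, there are at least $\epsilon|A|$ letters which cannot follow $b$ in the $\tau e_i$-direction. This implies that there are at least $|B_i| \epsilon |A| > \frac{\epsilon^2}{2d}|A|^2$ configurations with shape $\{0\} \cup \{\tau e_i\}$ which are not in $L(X)$, and so $|L_{\{0\} \cup \{\tau e_i\}}(X)| < |A|^2 \left(1 - \frac{\epsilon^2}{2d}\right)$. Then

\[ h(X) \leq \frac{1}{2} \log |L_{\{0\} \cup \{\tau e_i\}}(X)| < \log |A| - \frac{1}{2} \log \frac{2d}{2d - \epsilon^2}, \]

and so taking $\beta(\epsilon, d) = \frac{1}{2} \log \frac{2d}{2d - \epsilon^2}$ proves the first half of the theorem. For future reference, we note that $\beta(\epsilon, d) = \frac{1}{2} \log \frac{2d}{2d - \epsilon^2} = -\frac{1}{2} \log \left(1 - \frac{\epsilon^2}{2d}\right) > -\frac{1}{2} \left(-\frac{\epsilon^2}{2d}\right) = \frac{\epsilon^2}{4d}$.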

Now fix any $\epsilon > 0$, and suppose that $X$ is $\epsilon$-full. Then, for any $n$, we bound from below the size of $L_{[1,n]^d}(X)$. Construct configurations in the following way: order the sites in $[1, n]^d$ lexicographically. Then fill the first site in $[1, n]^d$ with any $G$-letter. Fill the second site with any $G$-letter which can legally appear next to the first placed $G$-letter, and continue in this fashion, filling the sites in order with $G$-letters, and each time placing any legal choice given the letters which have already been placed. By Lemma 2.30, at each step we will have at least $|A|(1 - (d + 1)\epsilon)$ choices. We can create more than $(|A|(1 - (d + 1)\epsilon))^{n^d}$ configurations in this way, and each one is in $L(X)$ by Corollary 2.31. This means that for any $n$,

\[ |L_{[1,n]^d}(X)| > (|A|(1 - (d + 1)\epsilon))^{n^d}, \]

which implies that $h(X) \geq \log(|A|(1 - (d + 1)\epsilon)) = \log |A| + \log(1 - (d + 1)\epsilon)$. Then taking $\epsilon(\beta, d) = \frac{1 - e^{-\beta}}{d + 1}$ proves the second half of the theorem.

□ 

We will informally refer to nearest neighbor $\mathbb{Z}^d$ SFTs with topological entropy close to $\log |A|$ as having "nearly full entropy." By Theorem 2.32, this condition is equivalent to being $\epsilon$-full for small $\epsilon$, and so we will often use the terms "$\epsilon$-full for small $\epsilon$" and "nearly full entropy" somewhat interchangeably in the sequel.

Before stating and proving our main results, here are some examples of nearest neighbor $\mathbb{Z}^d$ SFTs which are $\epsilon$-full for small $\epsilon$.

Example 2.33. Take $X$ to have alphabet $A = \{0, 1, \ldots, n\}$, and the only adjacency rule is that any neighbor of a 0 must also be a 0. Then $X$ is just the union of the full shift on $\{1, \ldots, n\}$ and a fixed point of all 0s. Clearly $X$ is $\epsilon$-full for $\epsilon > \frac{1}{n+1}$.

Example 2.34. Take $X$ to have alphabet $A = \{0, 1, \ldots, n\}$, and the only adjacency rule is that a 0 can only appear above and below other 0s. Then $X$ consists of points whose columns are either sequences on $\{1, \ldots, n\}$ (with no restrictions on which rows can appear) or all 0s. Again, clearly $X$ is $\epsilon$-full for $\epsilon > \frac{1}{n+1}$.

Example 2.35. Take $X$ to be the full shift on $A = \{0, 1, \ldots, n\}$. Then trivially $X$ is $\epsilon$-full for any $\epsilon > 0$. For the purposes of this example though, think of $A$ as being partitioned into $G = \{1, \ldots, n\}$ and $B = \{0\}$, which would demonstrate that $X$ is $\epsilon$-full for $\epsilon > \frac{1}{n+1}$.

These examples illustrate the different ways in which $B$-letters can coexist with $G$-letters, which is the unknown quantity in the description of $\epsilon$-full SFTs. In Examples 2.33 and 2.34, the existence of a $B$-letter forces the existence of an infinite component of $B$-letters. It turns out that in such examples, $B$-letters are rare in "most" configurations of $X$; in particular, they have zero measure for any measure of maximal entropy. In contrast, Example 2.35 clearly has a unique Bernoulli measure of maximal entropy, whose support contains all configurations (including those with $B$-letters). However, for large $n$, in "most" configurations the $B$-letters appear with the small frequency $\frac{1}{n+1}$. The dichotomy is that for measures of maximal entropy, either $B$-letters can only appear within infinite clusters of $B$-letters (and then have zero measure), or $B$-letters "coexist peacefully" with $G$-letters, and appear in most configurations, albeit with frequency less than $\epsilon$.

3. Properties of $\epsilon$-full/nearly full entropy SFTs
The goal of this section is to prove the following theorem. 

Theorem 3.1. For any d, and e<j := 3 $ 2 iid& > an U e d~f u ^ nearest neighbor 7L d SFT 
X has the following properties: 

(A) $X$ has a unique measure of maximal entropy $\mu$

(B) h(X) is computable in time e°( n ' 

(C) X is a measure-theoretic universal model 

(D) $\mu$ is measure-theoretically isomorphic to a Bernoulli measure

By Theorem 2.32, all such properties also hold for nearest neighbor $\mathbb{Z}^d$ SFTs with topological entropy close enough to the logarithm of their alphabet size:

2 

Theorem 3.2. For any d and Pd = J2 = ^s^rjrr , my nearest neighbor 7L d SFT 
$X$ with alphabet $A$ for which $h(X) > (\log |A|) - \beta_d$ has properties (A)-(D) from Theorem 3.1.

The next lemma will be fundamental to almost all future arguments, and deals with the conditional measure w.r.t. a measure of maximal entropy $\mu$ of a configuration consisting only of $B$-letters given a boundary configuration. We would like to be able to say that such configurations always have low conditional probability, but this depends on the boundary. For instance, in the SFT of Example 2.34, conditioning on a boundary configuration $\delta$ for which $\delta(0, -n) = \delta(0, n) = 0$ actually forces an entire column of 0s! For this reason, we for now deal only with the case where we condition on a boundary consisting only of $G$-letters.

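Lemma 3.3. For any $\epsilon < \frac{1}{4d+6}$, any $\epsilon$-full nearest neighbor $\mathbb{Z}^d$ SFT X, any set
$S \subseteq \mathbb{Z}^d$, any set $T \subseteq S$, any $\delta \in G^{\partial S}$, and any measure of maximal entropy $\mu$ on
X,

$$\mu([B^T] \mid [\delta]) \le N^{-|T|},$$

where N is $\lfloor \frac{1}{2}(\epsilon^{-1} - 4d - 4) \rfloor$.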

Proof. Consider any such $\epsilon$, X, S, T, and $\delta$, and define $N = \lfloor \frac{1}{2}(\epsilon^{-1} - 4d - 4) \rfloor$;
since $\epsilon < \frac{1}{4d+6}$, $N \ge 1$ and the inequality we wish to prove is nontrivial. As before,
we can reduce to the case where $|A|\epsilon \ge 1$ since otherwise X is forced to be a full
shift. By $\epsilon$-fullness of X, $|G| > |A|(1 - \epsilon) \ge |A|(1 - 2\epsilon) + 1$, and by definition of N,
$1 - 2\epsilon \ge 2\epsilon(N + 2d + 1)$. Therefore, $|G| > 2|A|\epsilon(N + 2d + 1) + 1$, and since $|G|$ is
an integer, $|G| \ge 2\lceil |A|\epsilon(N + 2d + 1) \rceil$.
We can then partition G into two pieces, call them $G_i$ and $G_b$, each of size at least
$|A|\epsilon(N + 2d + 1)$, and fix any orderings on the elements of $G_i$, $G_b$, and B.
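
The arithmetic behind N can be checked mechanically. The following Python sketch is illustrative only: the helper names `lemma_N` and `partition_check` are hypothetical and do not come from the paper. It computes $N = \lfloor \frac{1}{2}(\epsilon^{-1} - 4d - 4) \rfloor$ with exact rationals and tests the inequality $1 - 2\epsilon \ge 2\epsilon(N + 2d + 1)$ used in the proof.

```python
from fractions import Fraction
from math import floor


def lemma_N(eps: Fraction, d: int) -> int:
    """N = floor((1/eps - 4d - 4) / 2), the constant of Lemma 3.3."""
    return floor((1 / eps - 4 * d - 4) / 2)


def partition_check(eps: Fraction, d: int) -> bool:
    """Test 1 - 2*eps >= 2*eps*(N + 2d + 1), which follows from the definition of N."""
    n_const = lemma_N(eps, d)
    return 1 - 2 * eps >= 2 * eps * (n_const + 2 * d + 1)
```

For instance, with d = 2 and $\epsilon = \frac{1}{100}$ the sketch gives N = 44, and the inequality holds with equality in that case.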

Consider any configuration $u \in L_{S \cup \partial S}(X) \cap \langle B^T \rangle \cap \langle \delta \rangle$, i.e. u is globally
admissible with shape $S \cup \partial S$, $u|_T$ consists entirely of B-letters, and $u|_{\partial S} = \delta$. Then
the locations of the B-letters within u can be partitioned into maximal connected
components $C_i(u)$, $1 \le i \le k(u)$ (say we order these lexicographically by least
element), and we denote the subconfigurations of u occupying these components
by $B_i(u) = u|_{C_i(u)}$, $1 \le i \le k(u)$. We will now define a family of configurations
$f(u) \subseteq L_{S \cup \partial S}(X) \cap \langle \delta \rangle$.

Begin by removing all $B_i(u)$ from u, defining a new configuration $v(u)$ with
shape $(S \cup \partial S) \setminus \bigcup C_i(u)$ which consists only of G-letters. We fill the holes with
shapes $C_i(u)$ in order, starting with $C_1(u)$. For each i, we order the sites in $C_i(u)$
lexicographically, and choose G-letters to fill them, one by one. We will do this in
such a way that at each step, regardless of what letters have been assigned, we have
N choices of letters to use, and so the total number of configurations we define by
filling all holes, $|f(u)|$, will be at least $N^{|T|}$.

Suppose that we wish to fill a site $s \in C_i(u)$, meaning that each $C_j(u)$ for $j < i$
has been filled, and all sites lexicographically less than s in $C_i(u)$ have been filled
with G-letters. Then, consider all G-letters which can legally fill the site s given the
letters already assigned within $S \cup \partial S$. Since all letters assigned are G-letters, by
Lemma 2.30 there are at least $|A|(1 - (2d+1)\epsilon)$ choices. If $s \in \partial C_i(u)$, then we will use
only letters from $G_b$, and if $s \in C_i(u) \setminus \partial C_i(u)$, then we will use only letters from
$G_i$. In either case, though, since $|G_i|$ and $|G_b|$ are at least $|A|\epsilon(N + 2d + 1)$,
there are at least $|A|N\epsilon > N|B|$ possible choices. If $u(s)$ was the kth letter of B
with respect to the previously defined ordering on B, then we use any of the N
letters between the $((k-1)N + 1)$th and $kN$th letters (inclusive) in either $G_b$ or $G_i$
with respect to the previously defined orderings on these sets. Denote by $f(u)$ the
set of all configurations in $L_{S \cup \partial S}(X) \cap \langle \delta \rangle$ obtainable by using this filling algorithm
to fill all of the sites of T in order. Now for each site $s \in \bigcup C_i(u)$, each configuration
in $f(u)$ has a letter at s which encodes the following information: whether s was a
boundary site or an interior site within its $C_i(u)$ (encoded by whether we chose a
letter from $G_b$ or $G_i$), and the B-letter $u(s)$ which appeared at s in u (encoded by
which of the possible letters in $G_b$ or $G_i$ we used).

We now show that for any configurations $u \ne u'$ in $L_{S \cup \partial S}(X) \cap \langle B^T \rangle \cap \langle \delta \rangle$,
$f(u)$ and $f(u')$ are disjoint. First, we deal with the case where $k(u) = k(u')$ and
$C_i(u) = C_i(u')$ for $1 \le i \le k(u) = k(u')$. (Since they are equal, we just write
$C_i$ for $C_i(u) = C_i(u')$ and k for $k(u) = k(u')$.) Since $u \ne u'$, u and u' either
disagree somewhere outside the union of the $C_i$ or somewhere inside. If there is
a disagreement somewhere outside, then since all configurations in $f(u)$ and $f(u')$
agree with u and u' respectively outside the union of the $C_i$, it is obvious that $f(u)$
and $f(u')$ are disjoint. If there is a disagreement inside the union of the $C_i$, then
take j minimal so that there is a disagreement in $C_j$, and take $s \in C_j$ the minimal
site lexicographically for which $u(s) \ne u'(s)$. For a contradiction, assume that there
is a configuration w in $f(u) \cap f(u')$. Since all $C_i$ are identical and since u and u'
agree outside the union of the $C_i$, we know that exactly the same sites had been
filled, with exactly the same letters, when $w(s)$ was chosen in the filling procedure
defining $f(u)$ as when $w(s)$ was chosen in the filling procedure defining $f(u')$. But
this is a contradiction; since $u(s) \ne u'(s)$ and the same set of letters was available
to fill s in both procedures, the same letter could not possibly have been a legal
choice in both procedures.
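
The block encoding used by the filling procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `pool` is assumed to be the ordered list of letters (from $G_b$ or $G_i$) that are legal at the current site, and the function names `encoding_block` and `decode` are hypothetical. The sketch shows why the kth B-letter can be recovered from any letter in its block, which is the fact the disjointness argument relies on.

```python
def encoding_block(pool, k, n_const):
    """Letters available to encode the k-th B-letter (k is 1-indexed).

    These are the ((k-1)N+1)-th through kN-th letters of the ordered pool.
    """
    if k < 1 or k * n_const > len(pool):
        raise ValueError("pool too small to encode this B-letter")
    return pool[(k - 1) * n_const : k * n_const]


def decode(pool, letter, n_const):
    """Recover k from a letter chosen out of encoding_block(pool, k, n_const)."""
    return pool.index(letter) // n_const + 1
```

Blocks for different k are disjoint slices of the same ordered pool, so two different B-letters at the same site can never be encoded by the same letter.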

Now we deal with the case where either $k(u) \ne k(u')$ or $k(u) = k(u')$ and
$C_i(u) \ne C_i(u')$ for some $1 \le i \le k(u) = k(u')$. This implies that either there exists
$C_j(u)$ disjoint from all $C_{j'}(u')$ (or the same statement with u and u' reversed),
or there exist nonequal $C_j(u)$ and $C_{j'}(u')$ which have nonempty intersection (or
the same statement with u and u' reversed). The first case is impossible since by
definition, each $C_j(u)$ contains some site in T, and each site in T is contained in
some $C_i(u')$ (and the same statement is true when u and u' are reversed). Suppose
then that there exist j, j' so that $C_j(u) \ne C_{j'}(u')$ and $C_j(u) \cap C_{j'}(u') \ne \emptyset$. Then
there exists s which is in the boundary of $C_j(u)$ and the interior of $C_{j'}(u')$, or vice
versa. This means that when s is assigned in the filling procedures defining $f(u)$
and $f(u')$, either $w(s)$ must be from $G_b$ in the former case and $G_i$ in the latter, or
vice versa. Either way, it ensures that $f(u) \cap f(u') = \emptyset$.

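We have shown that all of the sets $f(u)$, $u \in L_{S \cup \partial S}(X) \cap \langle B^T \rangle \cap \langle \delta \rangle$, are disjoint.
Since each is a subset of $L_{S \cup \partial S}(X) \cap \langle \delta \rangle$ and each has size at least $N^{|T|}$, we have
shown that

$$|L_{S \cup \partial S}(X) \cap \langle \delta \rangle| \ge N^{|T|} \, |L_{S \cup \partial S}(X) \cap \langle B^T \rangle \cap \langle \delta \rangle|.$$

Recall that since $\mu$ is a measure of maximal entropy for X, by Proposition 2.16 it
is an MRF with uniform conditional probabilities $\Lambda^{\delta}$. Therefore,

$$\mu([B^T] \mid [\delta]) = \Lambda^{\delta}(\langle B^T \rangle) = \frac{|L_{S \cup \partial S}(X) \cap \langle B^T \rangle \cap \langle \delta \rangle|}{|L_{S \cup \partial S}(X) \cap \langle \delta \rangle|} \le N^{-|T|}.$$

□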



Remark 3.4. For future reference, we note that since $\mu$ is an MRF, Lemma 3.3
remains true if one additionally conditions on some sites outside $S \cup \partial S$, even if
these extra sites are taken to be B-letters.

We are now prepared to prove Theorem 3.1. We fix d, define $\epsilon_d = \frac{1}{3^3 2^{11} d^8}$,
and consider any $\epsilon_d$-full nearest neighbor $\mathbb{Z}^d$ SFT X. We usually suppress the
dependence on d and just write $\epsilon = \epsilon_d$ in the sequel.

Proof of (A). Recall that for any finite $S \subseteq \mathbb{Z}^d$ and $\delta \in L_{\partial S}(X)$, $\Lambda^{\delta}$ is the
conditional distribution on S given $\delta$ associated to any measure of maximal entropy,
which is uniformly distributed over all configurations $w \in A^S$ which form a locally
admissible configuration when combined with $\delta$. We will show that there is only a
single shift-invariant measure $\mu$ with these conditional distributions, implying that
there is only a single measure of maximal entropy. Our method is similar to that
of [2] in that we construct a coupling of $\Lambda^{\delta}$ and $\Lambda^{\delta'}$ for pairs of boundaries $\delta \ne \delta'$
of large connected shapes, and show that this coupling gives a high probability of
agreement far from $\delta$ and $\delta'$, implying that $\Lambda^{\delta}$ and $\Lambda^{\delta'}$ behave similarly far from $\delta$
and $\delta'$. (Informally, the influence of a boundary decays with distance.) However,
we must begin with the special case where $\delta$ and $\delta'$ consist entirely of G-letters.

Choose any finite connected sets $C, C' \subseteq \mathbb{Z}^d$ with nonempty intersection, any
site $s \in C \cap C'$, and any $\delta \in L_{\partial C}(X)$ and $\delta' \in L_{\partial C'}(X)$ consisting only of G-letters.
Define $D := d(s, \partial C \cup \partial C')$. We will construct a coupling $\lambda$ of $\Lambda^{\delta}$ and $\Lambda^{\delta'}$ which
gives very small probability to a disagreement at s (when D is large).

Define $\overline{C} = C \cup \partial C$ and $\overline{C'} = C' \cup \partial C'$. Fix any ordering on the set $\overline{C} \cup \overline{C'}$; from
now on when we talk about any notion of size for sites in $\overline{C} \cup \overline{C'}$, it is assumed
we are speaking of this ordering. For convenience, we will extend configurations
on C and C' to configurations on $\overline{C}$ and $\overline{C'}$ respectively by appending $\delta$ and $\delta'$
respectively. Therefore, $\lambda$ will be defined on pairs of configurations $(w_1, w_2)$ where
$w_1$ has shape $\overline{C}$ and $w_2$ has shape $\overline{C'}$; the marginalization of $\lambda$ which leads to a
true coupling of $\Lambda^{\delta}$ and $\Lambda^{\delta'}$ should be clear. We will define $\lambda$ on one site at a time,
assigning values to both $w_1(s)$ and $w_2(s)$ when s is in $C \cap C'$, and just assigning
one of these two values if s is in only one of the sets. We use $\zeta_1$ and $\zeta_2$ to denote
the (incomplete) configurations on $\overline{C}$ and $\overline{C'}$ respectively at any step. We therefore
begin with $\zeta_1 = \delta$ and $\zeta_2 = \delta'$. At any step of the construction, we use W to denote
the set of vertices in $\overline{C} \cup \overline{C'}$ on which either $\zeta_1$ or $\zeta_2$ has already received values.
(In particular, at the beginning, $W = \partial C \cup \partial C'$.) This means that $\zeta_1$ is always
defined on $W \cap \overline{C}$, and $\zeta_2$ is always defined on $W \cap \overline{C'}$. At an arbitrary step of the
construction, we choose the next site s on which to assign values in $\zeta_1$ and/or $\zeta_2$ as
follows:

(i) If there exists any site in $(C \cup C') \setminus W$ which is adjacent to a site in W at
which either $\zeta_1$ or $\zeta_2$ has been assigned a B-letter, then take s to be the smallest
such site.

(ii) If (i) does not apply, but there exists a site in $(C \cup C') \setminus W$ which is adjacent
to a site in $C \cap C' \cap W$ (i.e. a site at which both $\zeta_1$ and $\zeta_2$ have been defined), and
their values disagree, then take s to be the smallest such site.

(iii) If (i) and (ii) do not apply, but there exists a site in $(C \cup C') \setminus W$ which is
not in $C \cap C'$, then take s to be the smallest such site.

(iv) If none of (i)-(iii) apply, then take s to be the smallest site in $(C \cup C') \setminus W$.
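
The selection rules (i)-(iv) can be sketched in Python as follows. This is a minimal illustration, not the paper's construction verbatim: sites are integer tuples, the fixed ordering is assumed to be lexicographic, `zeta1` and `zeta2` are dictionaries holding the values assigned so far, and the names `neighbors` and `next_site` are hypothetical.

```python
def neighbors(site):
    """Nearest neighbors of a site in Z^d, with the site given as a tuple."""
    for i in range(len(site)):
        for step in (-1, 1):
            yield site[:i] + (site[i] + step,) + site[i + 1:]


def next_site(C, Cp, W, zeta1, zeta2, B):
    """Return (rule, site) for the next site to assign, or None if none remain."""
    unassigned = sorted((C | Cp) - W)  # assumption: lexicographic ordering

    def near_B(s):  # rule (i): next to an assigned B-letter
        return any(t in W and (zeta1.get(t) in B or zeta2.get(t) in B)
                   for t in neighbors(s))

    def near_disagreement(s):  # rule (ii): next to an assigned disagreement
        return any(t in W and t in zeta1 and t in zeta2 and zeta1[t] != zeta2[t]
                   for t in neighbors(s))

    def one_sided(s):  # rule (iii): the site lies in only one of C, C'
        return s not in (C & Cp)

    for rule, applies in (("i", near_B), ("ii", near_disagreement), ("iii", one_sided)):
        for s in unassigned:
            if applies(s):
                return rule, s
    return ("iv", unassigned[0]) if unassigned else None
```

The rules are tried in priority order, and within a rule the smallest qualifying site is returned, mirroring "take s to be the smallest such site".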

Now we are ready to define $\lambda$ on s. If s is in C but not C' (i.e. chosen according
to case (iii)), then assign $\zeta_1(s)$ randomly according to the marginalization of the
distribution $\Lambda^{\zeta_1}$ to s, and if s is in C' but not C, then assign $\zeta_2(s)$ randomly
according to the marginalization of the distribution $\Lambda^{\zeta_2}$ to s. (Here we are slightly
abusing notation: $\Lambda^{\eta}$ is technically only defined for $\eta$ a boundary configuration,
and here we may be conditioning on more than a boundary. The meaning should
be clear though: $\Lambda^{\zeta_1}$ simply represents the uniform conditional distribution on
$A^{C \setminus W}$ given $\zeta_1$, and $\Lambda^{\zeta_2}$ is similarly defined.)

If $s \in C \cap C'$ (i.e. chosen according to case (i) or case (ii)), then assign $\zeta_1(s)$ and
$\zeta_2(s)$ according to an optimal coupling of the marginalizations of the distributions
$\Lambda^{\zeta_1}$ and $\Lambda^{\zeta_2}$ to s. Since $\lambda$ is defined sitewise, and at each step is assigned according
to $\Lambda^{\zeta_1}$ in the first coordinate and $\Lambda^{\zeta_2}$ in the second, the reader may check that it
is indeed a coupling of $\Lambda^{\delta}$ and $\Lambda^{\delta'}$. The key property of $\lambda$ is the following:
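
For readers who want a concrete picture of an optimal coupling at a single site, here is a minimal Python sketch. It assumes that the distance d used in this section is total variation distance and that distributions are given as dictionaries from letters to exact probabilities; the function names are hypothetical. The coupling puts as much mass as possible on the diagonal, so the probability of disagreement equals the total variation distance.

```python
from fractions import Fraction


def total_variation(p, q):
    """Total variation distance between two finite distributions (dicts)."""
    keys = set(p) | set(q)
    return sum(abs(p.get(a, 0) - q.get(a, 0)) for a in keys) / 2


def optimal_coupling(p, q):
    """A coupling of p and q whose disagreement probability is total_variation(p, q)."""
    keys = sorted(set(p) | set(q))
    coupling = {}
    for a in keys:  # diagonal part: the mass shared by p and q
        shared = min(p.get(a, 0), q.get(a, 0))
        if shared > 0:
            coupling[(a, a)] = shared
    tv = total_variation(p, q)
    if tv > 0:  # spread the excess of p over the excess of q
        for a in keys:
            excess_p = p.get(a, 0) - min(p.get(a, 0), q.get(a, 0))
            if excess_p <= 0:
                continue
            for b in keys:
                excess_q = q.get(b, 0) - min(p.get(b, 0), q.get(b, 0))
                if excess_q > 0:
                    coupling[(a, b)] = excess_p * excess_q / tv
    return coupling
```

When the two distributions are identical, the coupling is supported on the diagonal, which is the situation exploited in the proof of Fact 3.5.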

Fact 3.5. For any site $s \in C \cap C'$, $\lambda$-a.s., $w_1(s) \ne w_2(s)$ if and only if there
exists a path $\gamma$ from s to $\partial C \cup \partial C'$ contained within $C \cap C'$ such that for each site
$t \in \gamma$, either one of $w_1(t)$ or $w_2(t)$ is a B-letter, or $w_1$ and $w_2$ disagree at t, i.e.
$w_1(t) \ne w_2(t)$.

Proof. The "if" direction is trivial. For the "only if" direction, assume for a
contradiction that $w_1(s) \ne w_2(s)$ and that no such path $\gamma$ exists. Then there is a closed
contour $\Gamma$ containing s and contained within $C \cap C'$ so that $w_1|_{\Gamma} = w_2|_{\Gamma} \in G^{\Gamma}$.
Denote by F the set of sites inside $\Gamma$. Then regardless of the order of the sites on
which $\lambda$ is defined, the first site in F which is assigned is done so by case (iv); since
it is the first site in F to be assigned, its neighbors are either unassigned or in $\Gamma$,
and so cases (i)-(iii) cannot apply. Call this site t.

Consider the state of $\lambda$ when t is assigned under case (iv). The sets of undefined
sites for $\zeta_1$ and $\zeta_2$ must be the same (since case (iii) was not applied), and every
site in $C \cup C'$ adjacent to a site in $(C \cup C') \setminus W$ must be a location at which $\zeta_1$
and $\zeta_2$ agree (since case (ii) was not applied). Then the distributions $\Lambda^{\zeta_1}$ and $\Lambda^{\zeta_2}$
are identical. This means that their optimal coupling has support contained in the
diagonal, and $\zeta_1(t) = \zeta_2(t)$ $\lambda$-a.s. It is then easy to see that $\lambda$-a.s., we remain in
case (iv) for the remainder of the construction. Therefore, $\lambda$-a.s., $\zeta_1$ and $\zeta_2$ agree
on all of F. Since $s \in F$, this clearly contradicts $w_1(s) \ne w_2(s)$.

□ 

We will now show that when D is large, the $\lambda$-probability of such a path
(consisting entirely of disagreements and sites where $w_1$ or $w_2$ contains a B-letter) is
very low. Consider any $(w_1, w_2)$ in the support of $\lambda$ containing $\gamma$, a path from s
to $\partial C \cup \partial C'$ contained within $C \cap C'$, which consists entirely of disagreements and
sites where $w_1$ or $w_2$ has a B-letter. By passing to a subpath if necessary, we can
assume that $\gamma$ is such a path of minimal length, which clearly implies that $\gamma$ is
contained entirely within $C \cap C'$. Denote the length of $\gamma$ by L; clearly $L \ge D$. For
technical reasons, we denote by $L' = 7\lceil \frac{L}{7} \rceil \ge L$ the smallest multiple of 7 greater
than or equal to L so that we can divide by 7 in the proof without dealing with
floor or ceiling functions. We quickly note a useful fact about $\gamma$: there cannot exist
a site t at which $w_1$ or $w_2$ contains a B-letter which is adjacent to three different
sites on $\gamma$. Assume for a contradiction that this could happen (and that the sites
adjacent to t on $\gamma$ are p, q, r, visited in that order when $\gamma$ is traversed from s to
$\partial C \cup \partial C'$.) Then the path obtained by replacing the portion $p \ldots q \ldots r$ of $\gamma$ by
$p\,t\,r$ would be shorter than $\gamma$, violating minimality of the length of $\gamma$. We need a
definition:

Definition 3.6. A site $t \in \gamma$ is B-proximate if there is $q \in N_t \cup \{t\}$ for
which $w_1(q)$ or $w_2(q)$ is a B-letter.

We now separate into two cases depending on whether the number of sites in $\gamma$
which are B-proximate is greater than or equal to $\frac{6L}{7}$ or not.

Case 1: $\gamma$ contains at least $\frac{6L}{7}$ B-proximate sites

It was noted earlier that any B-letter can be adjacent to at most two sites in $\gamma$,
and so any B-letter can "induce" at most three B-proximate sites on $\gamma$ (up to two
neighbors, and possibly itself). We can therefore pass to a subset of $\frac{2L}{7}$ B-proximate
sites where each is adjacent to a different B-letter in either $w_1$ or $w_2$, and again
pass to a subset in either $w_1$ or $w_2$ (w.l.o.g. we say $w_1$) of $\lceil \frac{L}{7} \rceil = \frac{L'}{7}$ B-proximate
sites on $\gamma$, each of which is adjacent to a different B-letter.

Denote by S this set of $\frac{L'}{7}$ sites on $\gamma$, and by T the set of $\frac{L'}{7}$ neighboring B-letters
in $w_1$. By Lemma 3.3, $\Lambda^{\delta}(\langle B^T \rangle) \le N^{-L'/7}$, where $N = \lfloor \frac{1}{2}(\epsilon^{-1} - 4d - 4) \rfloor$. Since
$\epsilon \le \epsilon_d$, $N > \frac{1}{5\epsilon}$. The number of possible such T for any given $\gamma$ of length L is
bounded from above by $\binom{L'}{L'/7}(2d)^{L'/7} < (36d)^{L'/7}$. Therefore, the $\Lambda^{\delta}$-probability that
there exists any such T for a fixed $\gamma$ is bounded from above by $(180 d \epsilon)^{L'/7}$. Since
there are fewer than $(2d)^L$ possible $\gamma$ of length L, the $\Lambda^{\delta}$-probability that there
exists any path $\gamma$ and T as defined above is less than

$$\sum_{L=D}^{\infty} \left(180 d (2d)^7 \epsilon\right)^{L'/7} \le \sum_{L=D}^{\infty} 2^{-L/7} = \frac{1}{1 - 2^{-1/7}} \, 2^{-D/7}$$

since $\epsilon \le \frac{1}{360 \cdot 2^7 d^8}$. The same is true of $\Lambda^{\delta'}$, and since $\lambda$ is a coupling of $\Lambda^{\delta}$ and
$\Lambda^{\delta'}$, the $\lambda$-probability that there exists any path $\gamma$ with at least $\frac{6L}{7}$ of its sites
B-proximate is less than $\frac{1}{1 - 2^{-1/7}} \, 2^{-D/7}$.

Case 2: $\gamma$ contains fewer than $\frac{6L}{7}$ B-proximate sites

In this case, there exists $R \subseteq \gamma$, $|R| = \lceil \frac{L}{7} \rceil = \frac{L'}{7}$, such that no site in R is
B-proximate. Since $\gamma$ consists entirely of sites where either one of $w_1$ and $w_2$ is a
B-letter or $w_1$ and $w_2$ disagree, this implies that for each $r \in R$, $w_1(r) \ne w_2(r)$.
Also, by the definition of B-proximate, for each $r \in R$, both $w_1|_{N_r}$ and $w_2|_{N_r}$
contain only G-letters. Order the elements of R as $r_1, \ldots, r_{L'/7}$. Our fundamental
claim is that for any $i \in [1, \frac{L'}{7}]$,

$$\lambda\left(w_1(r_i) \ne w_2(r_i) \;\middle|\; w_1(r_j) \ne w_2(r_j), \ 1 \le j < i, \text{ and } w_1|_{N_{r_{j'}}}, w_2|_{N_{r_{j'}}} \in G^{N_{r_{j'}}}, \ 1 \le j' \le i\right) < 12 d \epsilon. \tag{1}$$

To prove (1), we fix some $i \in [1, \frac{L'}{7}]$ and condition on the facts that $w_1(r_j) \ne w_2(r_j)$
for $1 \le j < i$, and that $w_1|_{N_{r_{j'}}}, w_2|_{N_{r_{j'}}} \in G^{N_{r_{j'}}}$ for $1 \le j' \le i$. Then the conditional
$\lambda$-distribution on $r_i$ is a weighted average of the $\lambda$-distribution assigned at site $r_i$,
taken over all possible evolutions of $w_1$ and $w_2$ in the definition of $\lambda$. For any such
evolution of $w_1$ and $w_2$, at the step where $w_1(r_i)$ and $w_2(r_i)$ were (simultaneously)
assigned, no unassigned site in either C or C' was adjacent to an assigned B-letter.
(Otherwise, the smallest such site lexicographically would be used instead of $r_i$,
under case (i) in the definition of $\lambda$.) This means that at this step, $\partial(C \setminus W)$ and
$\partial(C' \setminus W)$ both consist entirely of G-letters.

Therefore, independently of which evolution of $w_1$ and $w_2$ we consider, for any
possible $\zeta_1$ when $r_i$ was assigned, Lemma 3.3 implies that $\Lambda^{\zeta_1}(\langle G^{N_{r_i}} \rangle) \ge 1 - \frac{2d}{N} >
1 - 10 d \epsilon$. This means that for any possible $\zeta_1$ when $r_i$ was assigned, $\Lambda^{\zeta_1}|_{r_i}$ was a
weighted average of the conditional distributions $\Lambda^{\zeta_1}(\langle x(r_i) \rangle \mid \langle x|_{N_{r_i}} \rangle)$, where at
least $1 - 10 d \epsilon$ of the weights are associated to $x|_{N_{r_i}}$ consisting entirely of G-letters.
For any such $x|_{N_{r_i}}$, $\Lambda^{\zeta_1}(\langle x(r_i) \rangle \mid \langle x|_{N_{r_i}} \rangle)$ is a uniform distribution over a subset of
A of size at least $|A|(1 - 2 d \epsilon)$ by $\epsilon$-fullness of X. Therefore, for any such $x|_{N_{r_i}}$,

$$d\left(\Lambda^{\zeta_1}(\langle x(r_i) \rangle \mid \langle x|_{N_{r_i}} \rangle), U\right) \le 2 d \epsilon,$$

where we use U to denote the uniform distribution over all of A. The analogous
estimate also holds for $\Lambda^{\zeta_2}$ by exactly the same argument. Since at least $1 - 10 d \epsilon$
of the measures $\Lambda^{\zeta_1}|_{r_i}$ and $\Lambda^{\zeta_2}|_{r_i}$ have been decomposed as weighted averages of
distributions within $2 d \epsilon$ of U,

$$d\left(\Lambda^{\zeta_1}|_{r_i}, \Lambda^{\zeta_2}|_{r_i}\right) < 12 d \epsilon.$$

Since the marginalization of $\lambda$ to $r_i$ is an optimal coupling of these two measures,
this marginalization gives a probability of less than $12 d \epsilon$ to the event $w_1(r_i) \ne
w_2(r_i)$. Since the same is true for every evolution of $w_1$ and $w_2$, we have shown
that conditioned on $w_1(r_j) \ne w_2(r_j)$ for $1 \le j < i$ and $w_1|_{N_{r_{j'}}}, w_2|_{N_{r_{j'}}} \in G^{N_{r_{j'}}}$ for
$1 \le j' \le i$, $\lambda(w_1(r_i) \ne w_2(r_i)) < 12 d \epsilon$, verifying (1).

From this, it is clear that $\lambda(\text{no site in } R \text{ is B-proximate}) < (12 d \epsilon)^{L'/7}$ by
decomposing it as a product of conditional probabilities. There are at most $(2d)^L$
choices for $\gamma$ and at most $\binom{L'}{L'/7} < 18^{L'/7}$ choices for the subset R, so the $\lambda$-probability
that there is any path $\gamma$ with at least $\frac{L'}{7}$ non-B-proximate sites is less than

$$\sum_{L=D}^{\infty} \left(216 d (2d)^7 \epsilon\right)^{L'/7} \le \sum_{L=D}^{\infty} 2^{-L/7} = \frac{1}{1 - 2^{-1/7}} \, 2^{-D/7}$$

since $\epsilon \le \frac{1}{432 \cdot 2^7 d^8}$.

Clearly any path $\gamma$ from s to $\partial C \cup \partial C'$ contained within $C \cap C'$ which consists
entirely of disagreements and locations where either $w_1$ or $w_2$ has a B-letter must
be in either Case 1 or Case 2, so we have shown that the $\lambda$-probability that there
exists any such path is less than $\frac{1}{1 - 2^{-1/7}} 2^{-D/7} + \frac{1}{1 - 2^{-1/7}} 2^{-D/7} \le Z 2^{-D/7}$ for a constant
Z independent of D. By Fact 3.5, $\lambda$-a.s. $w_1(s) \ne w_2(s)$ if and only if there exists
such a $\gamma$, and so $\lambda(w_1(s) \ne w_2(s)) < Z 2^{-D/7}$. Clearly this implies via a simple union
bound that for any shape S consisting of sites at a distance at least D from $\partial C$
and $\partial C'$, $\lambda(w_1|_S \ne w_2|_S) < Z |S| 2^{-D/7}$.
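
The constants in Cases 1 and 2 can be sanity-checked numerically. The Python sketch below is illustrative only and rests on stated assumptions: it takes $\epsilon_d = \frac{1}{3^3 2^{11} d^8}$ (a reading of the constant in Theorem 3.1), takes Z to be the sum of the two case bounds, and uses hypothetical function names. It checks that both geometric ratios are at most $\frac{1}{2}$ and that $\binom{7m}{m} < 18^m$ on a finite range.

```python
from fractions import Fraction
from math import comb


def eps_d(d):
    """Assumed value of epsilon_d = 1 / (3^3 * 2^11 * d^8)."""
    return Fraction(1, 3**3 * 2**11 * d**8)


def case_ratios(d):
    """The bases 180 d (2d)^7 eps and 216 d (2d)^7 eps of the two geometric sums."""
    eps = eps_d(d)
    return 180 * d * (2 * d) ** 7 * eps, 216 * d * (2 * d) ** 7 * eps


def binomial_bound_holds(max_m):
    """Check C(7m, m) < 18^m for 1 <= m <= max_m (a finite check, not a proof)."""
    return all(comb(7 * m, m) < 18**m for m in range(1, max_m + 1))


def tail_constant():
    """Z = 2 / (1 - 2^(-1/7)), assuming Z is the sum of the two case bounds."""
    return 2 / (1 - 2 ** (-1 / 7))
```

Under these assumptions the two ratios are independent of d, which is why the same $\epsilon_d$ works in every dimension once the factor $d^8$ is included.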

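Since $\lambda$ is a coupling of $\Lambda^{\delta}$ and $\Lambda^{\delta'}$, we have shown the following:

Fact 3.7. For any $\delta \in L_{\partial C}(X)$ and $\delta' \in L_{\partial C'}(X)$ consisting only of G-letters, and
for any shape $S \subseteq C \cap C'$,

$$d\left(\Lambda^{\delta}|_S, \Lambda^{\delta'}|_S\right) < Z |S| 2^{-D/7}, \tag{2}$$

where $D := d(S, \partial C \cup \partial C')$.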

We note that (2) is very close to the classical condition of (weak) spatial mixing
with exponential rate (see [5] for a survey of various results and discussions involving
spatial mixing of MRFs), but with the important difference that it only holds here
for boundaries consisting entirely of G-letters. To finish the proof, we must now
consider general boundaries $\eta$. For this portion of the proof, we will use only the
fact that $d\left(\Lambda^{\delta}|_S, \Lambda^{\delta'}|_S\right)$ decays to 0 as $D \to \infty$, ignoring the exponential rate.

Roughly speaking, the strategy is to show that for any connected set $C \subseteq \mathbb{Z}^d$,
any measure of maximal entropy $\mu$ on X and for any finite shape $S \subseteq C$ far from
$\partial C$, there are sets of boundary conditions $\eta \in L_{\partial C}(X)$ with $\mu$-measure approaching
1 (as $d(S, \partial C) \to \infty$) whose members have the following property: with very high
$\Lambda^{\eta}$-probability there exists a closed contour $\delta$ of G-letters contained in C, containing
S, and which is far from S. Then, for any such $\eta$, most of $\Lambda^{\eta}|_S$ can be written
as a weighted average over $\Lambda^{\delta}|_S$ for such $\delta$. We have already shown that $\Lambda^{\delta}|_S$ has
very little dependence on $\delta$ consisting only of G-letters when $\delta$ is far from S, and
so $\Lambda^{\eta}(\langle x|_S \rangle)$ has little dependence on $\eta$. We can write $\mu|_S$ as a weighted average
of $\Lambda^{\eta}|_S$, and since the above shows that dependence on $\eta$ fades as C becomes large
for sets of $\eta$ of measure approaching 1, $\mu|_S$ has only one possible value. Since S
was arbitrary, this shows that $\mu$ is the unique measure of maximal entropy on X.

In the sequel, we use the notation $S \leftrightarrow T$ to denote the event that there is a path
of B-letters connecting some site in S to some site in T. We first need to prove the
following:

Fact 3.8. For any measure of maximal entropy $\mu$ on X,

$$\lim_{n \to \infty} \mu(0 \leftrightarrow \partial[-n,n]^d) = 0. \tag{3}$$

Proof. For a contradiction, suppose (3) is false. Then since the events $0 \leftrightarrow \partial[-n,n]^d$
are decreasing, there exists $\alpha > 0$ so that for all n, $\mu(0 \leftrightarrow \partial[-2n,2n]^d) > \alpha$.
Then by stationarity, for each $s \in [-n,n]^d$, $\mu(s \leftrightarrow (s + \partial[-2n,2n]^d)) > \alpha$. Since
$s + [-2n,2n]^d \supseteq [-n,n]^d$, $s \leftrightarrow (s + \partial[-2n,2n]^d)$ implies $s \leftrightarrow \partial[-n,n]^d$, and so
$\mu(s \leftrightarrow \partial[-n,n]^d) > \alpha$ for all $s \in [-n,n]^d$. Then

$$\mu\left(|\{s \in [-n,n]^d : s \leftrightarrow \partial[-n,n]^d\}| > 0.5\alpha\,|[-n,n]^d|\right) > 0.5\alpha. \tag{4}$$

Since $\mu$ is an MRF with conditional probabilities $\{\Lambda^{\delta}\}$, we may write $\mu|_{[-n,n]^d}$ as
a weighted average:

$$\mu|_{[-n,n]^d} = \sum_i p_i \Lambda^{\delta_i},$$

where $\delta_i$ ranges over configurations in $L_{\partial[-n,n]^d}(X)$ (and $p_i = \mu([\delta_i])$). By (4), at
least $0.25\alpha$ of the weights $p_i$ are associated to $\delta_i$ for which

$$\Lambda^{\delta_i}\left(|\{s \in [-n,n]^d : s \leftrightarrow \partial[-n,n]^d\}| > 0.5\alpha\,|[-n,n]^d|\right) > 0.25\alpha. \tag{5}$$

In other words, if we denote the set of $\delta_i$ which satisfy (5) by P, then $\mu([P]) >
0.25\alpha$. Make the notation $K = |\{w \in L_{[-n-1,n+1]^d}(X) : w|_{\partial[-n,n]^d} \in P\}|$. Recall
that $-\sum_{w \in L_{[-n-1,n+1]^d}(X)} \mu([w]) \log \mu([w]) \ge |[-n-1,n+1]^d|\, h(\mu)$, which equals
$|[-n-1,n+1]^d|\, h(X)$ since $\mu$ is a measure of maximal entropy. Then

$$|[-n-1,n+1]^d|\, h(X) \le -\sum_{w \in L_{[-n-1,n+1]^d}(X)} \mu([w]) \log \mu([w]) \tag{6}$$
$$= -\sum_{\delta \notin P} \ \sum_{w \in L_{[-n-1,n+1]^d}(X),\ w|_{\partial[-n,n]^d} = \delta} \mu([w]) \log \mu([w]) \;-\; \sum_{\delta \in P} \ \sum_{w \in L_{[-n-1,n+1]^d}(X),\ w|_{\partial[-n,n]^d} = \delta} \mu([w]) \log \mu([w])$$
$$\le (1 - 0.25\alpha) \log |L_{[-n-1,n+1]^d}(X)| + 0.25\alpha \log K - \log(0.25\alpha),$$

where the last inequality uses the easily checked fact that for any positive real
numbers $\beta_1, \ldots, \beta_k$ with sum $\beta$, $\sum_{i=1}^{k} (-\beta_i \log \beta_i) \le \beta(\log k - \log \beta)$.
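
The "easily checked fact" is the statement that, among k positive numbers with a fixed sum, the quantity $\sum -\beta_i \log \beta_i$ is maximized when they are all equal. A small numerical sketch in Python (function names hypothetical, natural logarithm assumed, floating point only) makes the two sides easy to compare on examples.

```python
from math import log


def partial_entropy(betas):
    """Left-hand side: the sum of -b * log(b) over the given positive numbers."""
    return sum(-b * log(b) for b in betas)


def entropy_bound(betas):
    """Right-hand side: beta * (log k - log beta), with beta = sum and k = count."""
    beta = sum(betas)
    return beta * (log(len(betas)) - log(beta))
```

Equality holds exactly when all the $\beta_i$ coincide, which is the case of a uniform distribution on k configurations rescaled to total mass $\beta$.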

By definition of topological entropy, for any $\theta > 0$, there exists $N_{\theta}$ such that
$(1 - 0.25\theta\alpha) \log |L_{[-n,n]^d}(X)| < |[-n,n]^d|\, h(X) + \log(0.25\alpha)$ for $n > N_{\theta}$. This means,
in particular, that for $n > N_{\theta}$,

$$(1 - 0.25\theta\alpha) \log |L_{[-n-1,n+1]^d}(X)| < |[-n-1,n+1]^d|\, h(X) + \log(0.25\alpha)$$
$$\le (1 - 0.25\alpha) \log |L_{[-n-1,n+1]^d}(X)| + 0.25\alpha \log K \quad \text{(by (6))},$$
$$\log K > (1 - \theta) \log |L_{[-n-1,n+1]^d}(X)|.$$

Therefore, for $n > N_{\theta}$, $K > |L_{[-n-1,n+1]^d}(X)|^{1-\theta}$. Since there are fewer than
$|A|^{|\partial[-n,n]^d|}$ elements of P, there exists $\delta \in P$ and a set of at least $\frac{|L_{[-n-1,n+1]^d}(X)|^{1-\theta}}{|A|^{|\partial[-n,n]^d|}}$
configurations w for which $w\delta \in L(X)$. Then, since $\delta$ satisfies (5), there is a set
of configurations $S \subseteq L_{[-n-1,n+1]^d}(X)$ of size at least $\frac{0.25\alpha\, |L_{[-n-1,n+1]^d}(X)|^{1-\theta}}{|A|^{|\partial[-n,n]^d|}}$, each
of which contains at least $0.5\alpha\,|[-n,n]^d|$ sites connected to $\partial[-n,n]^d$ by paths of
B-letters.
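
The event "s is connected to the boundary by a path of B-letters" is a plain graph-connectivity condition and can be computed by breadth-first search. The Python sketch below works on a finite two-dimensional grid and treats the outermost layer of the grid as the boundary; both choices, like the function name, are simplifying assumptions for illustration rather than the paper's exact setup.

```python
from collections import deque


def sites_connected_to_boundary(grid, bad):
    """Sites holding a letter from `bad` that are joined to the outer layer
    of the grid by a nearest-neighbor path of such letters."""
    rows, cols = len(grid), len(grid[0])
    seen, queue = set(), deque()
    for r in range(rows):
        for c in range(cols):
            on_boundary = r in (0, rows - 1) or c in (0, cols - 1)
            if on_boundary and grid[r][c] in bad:
                seen.add((r, c))
                queue.append((r, c))
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in seen and grid[nr][nc] in bad):
                seen.add((nr, nc))
                queue.append((nr, nc))
    return seen
```

Dividing the size of the returned set by the number of sites gives the empirical fraction that appears inside (4) and (5).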

We now perform a very similar replacement procedure to the one used in the
proof of Lemma 3.3. We will not rewrite the entire construction, rather mainly
summarizing the changes from the previous procedure. Consider any $u \in S$, and
take $C_i(u)$, $1 \le i \le k$, to be the maximal connected components of locations of
B-letters in u which have nonempty intersection with $\partial[-n,n]^d$. Since $u \in S$,
$\sum |C_i(u)| \ge 0.5\alpha\,|[-n,n]^d|$. For each i, define $B_i(u)$ to be the subconfiguration
$u|_{C_i(u)}$ of u occupying $C_i(u)$. Then, remove all $B_i(u)$ from u, and fill the holes
in various ways using the same procedure as in Lemma 3.3, where each site s is
filled with a G-letter which encodes the information about the B-letter $u(s)$ and
whether s was on the boundary of its component $C_i(u)$ or in the interior. As in
Lemma 3.3, this yields a set $f(u) \subseteq L_{[-n-1,n+1]^d}(X)$ of configurations of size at
least $N^{0.5\alpha|[-n,n]^d|}$, where $N = \lfloor \frac{1}{2}(\epsilon^{-1} - 4d - 4) \rfloor$. Here, we will actually only need
the fact that $N \ge 2$, which is true since $\epsilon < \frac{1}{4d+8}$. We have

$$\sum_{u \in S} |f(u)| \ge |S|\, 2^{0.5\alpha|[-n,n]^d|}.$$

In Lemma 3.3, we showed that all of the sets $f(u)$ were disjoint, which is not
necessarily the case here. However, it is still true that if there exist $C_i(u)$ and $C_j(u')$
which are unequal but have nonempty intersection, then $f(u) \cap f(u') = \emptyset$. It is also
still true that if there exist $C_i(u)$ and $C_j(u')$ which are equal, but $B_i(u) \ne B_j(u')$,
then $f(u) \cap f(u') = \emptyset$. The only new case under which $f(u)$ and $f(u')$ might
not be disjoint is if all pairs $C_i(u)$ and $C_j(u')$ are either disjoint or equal, and if
$B_i(u) = B_j(u')$ whenever $C_i(u) = C_j(u')$; suppose we are in this case. Fix any
$v \in L_{[-n-1,n+1]^d}(X)$, and let us bound the size of $F_v := \{u : v \in f(u)\}$ from
above.

For each $s \in \partial[-n,n]^d$, either s is in some $C_i(u_s)$ for some configuration $u_s \in F_v$
or not. If it is, then denote the configuration $B_i(u_s)$ by $B(s)$. By the above analysis,
for every $u \in F_v$, either $B(s) = B_i(u)$ for some i or all $B_i(u)$ are disjoint from $B(s)$
(in particular, this would imply that no $B_i(u)$ contains s). This in turn implies that
for every $u \in F_v$, the set $\{B_i(u)\}$ is just a subset of $\{B(s)\}_{s \in \partial[-n,n]^d}$. Since knowing
$\{B_i(u)\}$ along with v uniquely determines u, and since there are at most $|\partial[-n,n]^d|$
sets $B(s)$, $|F_v| \le 2^{|\partial[-n,n]^d|}$. In other words, each v is in at most $2^{|\partial[-n,n]^d|}$ of the
sets $f(u)$. Since this is true for any v, we have shown that

$$|L_{[-n-1,n+1]^d}(X)| \ge \frac{\sum_{u \in S} |f(u)|}{2^{|\partial[-n,n]^d|}} \ge \frac{|S|\, 2^{0.5\alpha|[-n,n]^d|}}{2^{|\partial[-n,n]^d|}} \ge |L_{[-n-1,n+1]^d}(X)|^{1-\theta}\, \frac{0.25\alpha\, 2^{0.5\alpha|[-n,n]^d|}}{(2|A|)^{|\partial[-n,n]^d|}},$$

and therefore

$$|L_{[-n-1,n+1]^d}(X)|^{\theta} \ge \frac{0.25\alpha\, 2^{0.5\alpha|[-n,n]^d|}}{(2|A|)^{|\partial[-n,n]^d|}}.$$

However, since $|L_{[-n-1,n+1]^d}(X)| \le |A|^{|[-n-1,n+1]^d|}$, this clearly gives a contradiction
for small enough $\theta$ and sufficiently large n (both larger than $N_{\theta}$ and large
enough so that $\frac{|\partial[-n,n]^d|}{|[-n,n]^d|}$ is much smaller than $0.5\alpha$). Therefore, our original
assumption was wrong and (3) is true, i.e. $\lim_{n \to \infty} \mu(0 \leftrightarrow \partial[-n,n]^d) = 0$.

□ 

We are now ready to complete the proof of (A). Choose $\mu$ to be any measure of
maximal entropy on X, and fix any k, l, and $\varepsilon > 0$. By Fact 3.8, we can choose
$n > k + l$ large enough that $\mu(0 \leftrightarrow \partial[-n+k+l, n-k-l]^d) < \frac{\varepsilon}{|[-k-l,k+l]^d|}$. Then
by stationarity of $\mu$, $\mu(t \leftrightarrow t + \partial[-n+k+l, n-k-l]^d) < \frac{\varepsilon}{|[-k-l,k+l]^d|}$ for all
$t \in [-k-l,k+l]^d$. Since $[-n,n]^d \supseteq t + [-n+k+l, n-k-l]^d$ for all such t, the
event $t \leftrightarrow \partial[-n,n]^d$ is contained in the event $t \leftrightarrow t + \partial[-n+k+l, n-k-l]^d$ for all
such t, and so $\mu(t \leftrightarrow \partial[-n,n]^d) < \frac{\varepsilon}{|[-k-l,k+l]^d|}$. Summing over all $t \in [-k-l,k+l]^d$
yields $\mu([-k-l,k+l]^d \leftrightarrow \partial[-n,n]^d) < \varepsilon$.

This implies that for any such n, there is a set $U_n \subseteq L_{[-n,n]^d}(X)$ with $\mu([U_n]) > 1 - \varepsilon$
such that any $w \in U_n$ contains a closed contour consisting entirely of G-letters
containing $[-k-l,k+l]^d$ in its interior. For any $w \in U_n$, if $\{\gamma_i\} = \{\partial S_i\}$ is the
collection of all such contours, then clearly $\gamma(w) := \partial(\bigcup S_i)$ is the unique maximal
such contour, i.e. any other such closed contour $\gamma'$ for w is contained in the union
of $\gamma(w)$ and its interior. Define $B(w)$ to consist of the set of all sites of $[-n,n]^d$
on or outside $\gamma(w)$, and $D(w) = w|_{B(w)}$. We note that $[U_n]$ can be written as
a disjoint union of the sets $[L_{[-n,n]^d}(X) \cap \langle D(w) \rangle]$ over all possible choices for
$D(w)$. (For clarity, we note that $L_{[-n,n]^d}(X) \cap \langle D(w) \rangle$ consists of all configurations
x in $L_{[-n,n]^d}(X)$ for which $x|_{B(w)} = D(w)$.) This means that $\mu$, restricted to
$U_n$ and then marginalized to $[-k,k]^d$, can be written as a weighted average of the
conditional measures $\mu(\langle x|_{[-k,k]^d} \rangle \mid [D(w)])$ over possible values of $D(w)$, and since
$\mu$ is an MRF, this is actually a weighted average of $\Lambda^{\gamma(w)}|_{[-k,k]^d}$.

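However, each $\gamma(w)$ is a closed contour of G-letters with distance greater than l
from $[-k,k]^d$, and so by Fact 3.7, for any $\gamma(w)$ and any $\eta \in L_{\partial[-n,n]^d}(X)$ consisting
only of G-letters,

$$d\left(\Lambda^{\gamma(w)}|_{[-k,k]^d}, \Lambda^{\eta}|_{[-k,k]^d}\right) < Z\,|[-k,k]^d|\, 2^{-l/7}. \tag{7}$$

Since the set $U_n$ has $\mu$-measure at least $1 - \varepsilon$, and since $\mu|_{[-k,k]^d}$ restricted to $U_n$
can be decomposed as a weighted average of measures $\Lambda^{\gamma(w)}|_{[-k,k]^d}$, (7) implies that

$$d\left(\mu|_{[-k,k]^d}, \Lambda^{\eta}|_{[-k,k]^d}\right) < Z\,|[-k,k]^d|\, 2^{-l/7} + \varepsilon. \tag{8}$$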

By taking $l \to \infty$ and $\varepsilon \to 0$ (thus forcing $n \to \infty$, since n was chosen larger
than $k + l$), we see that $\mu|_{[-k,k]^d}$ is in fact uniquely determined by the conditional
probabilities $\Lambda^{\delta}$. Since k was arbitrary, $\mu$ is the unique shift-invariant MRF with
conditional probabilities $\Lambda^{\delta}$, implying by Proposition 2.16 that $\mu$ is the unique
measure of maximal entropy on X, proving (A).

□ 

We now state two corollaries of the proof of (A), which will be useful later for
the proofs of (B)-(D).

Corollary 3.9. If X is an $\epsilon_d$-full nearest neighbor $\mathbb{Z}^d$ SFT with unique measure of
maximal entropy $\mu$, then $\mu$ is the (unique) weak limit (as $n \to \infty$) of $\Lambda^{\eta_n}$ for any
sequence $\eta_n \in L_{\partial[-n,n]^d}(X)$ of boundary configurations consisting only of G-letters.

Proof. Choose any such sequence $\eta_n$. For any k, (8) implies that as $n \to \infty$,
$\Lambda^{\eta_n}|_{[-k,k]^d}$ approaches $\mu|_{[-k,k]^d}$ weakly. Therefore, $\Lambda^{\eta_n} \to \mu$. □

Corollary 3.10. If X is an $\epsilon_d$-full nearest neighbor $\mathbb{Z}^d$ SFT with unique measure
of maximal entropy $\mu$, then any configuration $u \in L(X)$ containing only G-letters
on its inner boundary has positive $\mu$-measure.

Proof. Consider any such configuration $u \in L_S(X)$. It was shown in the proof of
(A) that there exists $T \supseteq S$ and a closed contour $\delta \in L_{\partial T}$ of G-letters containing S
for which $\mu([\delta]) > 0$. (Specifically, take k, l, n large enough that $S \subseteq [-k-l,k+l]^d$
and $\mu([U_n]) > 0$, choose $w \in U_n$ with $\mu([w]) > 0$, and then take $\delta = \gamma(w)$.)
Since the concatenation $u\delta$ has inner boundary consisting only of G-letters, by
Corollary 2.31 it is globally admissible. Therefore, there exists a configuration
$v \in L(X)$ with $v|_S = u$ and $v|_{\partial T} = \delta$, implying that $\Lambda^{\delta}(\langle u \rangle) > 0$. Then $\mu([u]) \ge
\mu([u] \cap [\delta]) = \mu([\delta])\, \mu([u] \mid [\delta]) = \mu([\delta])\, \Lambda^{\delta}(\langle u \rangle) > 0$. □

Remark 3.11. At first glance, (A) may appear to be an extension of the main result
from [2], which guaranteed uniqueness of MRFs corresponding to certain classes of
conditional probabilities. However, this is not the case; even for $\epsilon$ arbitrarily close to
0, $\epsilon$-full SFTs may still support multiple MRFs with the same uniform conditional
probabilities $\Lambda^{\delta}$, some corresponding to limits of boundary conditions involving
B-letters. (For instance, in Example 2.33, both the point mass at $0^{\mathbb{Z}}$ and the
Bernoulli measure of maximal entropy on $\{1, \ldots, n\}^{\mathbb{Z}}$ are MRFs with conditional
probabilities $\Lambda^{\delta}$.) We very much need the extra condition of maximal entropy to
rule out all but one of these MRFs as "degenerate."

Proof of (B). Our strategy is to first show that we can compute h(X) by taking
the exponential growth rate of globally admissible configurations whose boundaries
contain only G-letters, and that we can bound the rate at which these approximations
approach h(X). Then, we can easily write an algorithm which counts such
configurations, since by Corollary 2.31 a configuration with boundary containing
only G-letters is globally admissible iff it is locally admissible.

Fix any $n$, and denote by $\Gamma$ the set $\partial[1,n]^d$. For any $\delta \in L_{\Gamma}(X)$, we will show
that $|L_{[1,n]^d}(X) \cap \langle\delta\rangle| \le |L_{[1,n]^d}(X) \cap \langle G^{\Gamma}\rangle|$, i.e. the number of globally admissible
configurations with shape $[1,n]^d$ whose restriction to $\Gamma$ equals $\delta$ is less than or
equal to the number of globally admissible configurations with shape $[1,n]^d$ whose
restriction to $\Gamma$ consists entirely of $G$-letters. The proof involves another replacement
procedure similar to the one from the proof of Lemma 3.3; the difference is
that for any $u \in L_{[1,n]^d}(X) \cap \langle\delta\rangle$, we will define only a single configuration $f(u)$ in
$L_{[1,n]^d}(X)$, rather than a set. Take $C_i(u)$, $1 \le i \le k(u)$, to be the set of maximal
connected components of locations of $B$-letters in $u$ which have nonempty intersection
with $\Gamma$, and for each $i$ define $B_i(u) = u|_{C_i(u)}$ to be the subconfiguration of $u$
occupying $C_i(u)$. Just as in the proof of Lemma 3.3, remove all $B_i(u)$ from $u$, and
fill the holes with configurations of $G$-letters, where the letter chosen to fill a site $s$
encodes the letter $u(s)$ and the information of whether $s$ was on the boundary of its
component $C_i(u)$ or in the interior. For exactly the same reasons as in Lemma 3.3,
$u \neq u' \Rightarrow f(u) \neq f(u')$. We also note that all $f(u)$ are in $L_{[1,n]^d}(X) \cap \langle G^{\Gamma}\rangle$, meaning
that their restrictions to $\Gamma$ consist entirely of $G$-letters.

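We have then shown that $|L_{[1,n]^d}(X) \cap \langle\delta\rangle| \le |L_{[1,n]^d}(X) \cap \langle G^{\Gamma}\rangle|$. By summing
over all choices for $\delta$, we see that $|L_{[1,n]^d}(X)| \le |A|^{|\Gamma|}\,|L_{[1,n]^d}(X) \cap \langle G^{\Gamma}\rangle|$. This
means that

$$h(X) \le \frac{1}{n^d}\log|L_{[1,n]^d}(X)| \le \frac{1}{n^d}\Big(\log|L_{[1,n]^d}(X) \cap \langle G^{\Gamma}\rangle| + |\Gamma|\log|A|\Big) \le \frac{1}{n^d}\log|L_{[1,n]^d}(X) \cap \langle G^{\Gamma}\rangle| + \frac{2d\log|A|}{n}. \tag{9}$$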

We now make a simple observation: for any $k$ and any configurations $w_t \in
L_{[1,n]^d}(X) \cap \langle G^{\Gamma}\rangle$, $t \in [1,k]^d$, define the concatenation $u$ of all $w_t$, which has shape
$\bigcup_t \prod_{i=1}^{d} [1 + (t_i - 1)(n+1),\ t_i(n+1) - 1]$. Then $u$ is made up of a union of locally
admissible configurations where each pair is separated by a distance of at least 1,
so it is locally admissible. Then by Corollary 2.31, since the outer boundary of $u$
consists only of $G$-letters and $\epsilon$ is small enough for that corollary to apply, $u$ is also globally admissible, meaning in
particular that it is a subconfiguration of a configuration in $L_{[1,k(n+1)]^d}(X)$. This
implies that for any $k > 0$,

$$|L_{[1,k(n+1)]^d}(X)| \ge |L_{[1,n]^d}(X) \cap \langle G^{\Gamma}\rangle|^{k^d}.$$

By taking logs of each side, dividing by $(k(n+1))^d$, and letting $k \to \infty$, we see that

$$h(X) \ge \frac{1}{(n+1)^d}\log|L_{[1,n]^d}(X) \cap \langle G^{\Gamma}\rangle|. \tag{10}$$
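The shape of the concatenation used to obtain (10) can be sanity-checked with a short script. This is only a sketch: the function names `block_interval` and `concatenation_sites` are ours, and it assumes that block $t$ occupies $[1 + (t_i - 1)(n+1),\ t_i(n+1) - 1]$ along each axis $i$, so that consecutive blocks are separated by a single empty site.

```python
from itertools import product


def block_interval(t_i, n):
    """Interval [1 + (t_i - 1)(n + 1), t_i(n + 1) - 1] occupied by block t_i along one axis."""
    return (1 + (t_i - 1) * (n + 1), t_i * (n + 1) - 1)


def concatenation_sites(n, k, d):
    """All sites covered by the k^d blocks of side length n (hypothetical helper)."""
    sites = set()
    for t in product(range(1, k + 1), repeat=d):
        # One range of n consecutive coordinates per axis for this block.
        ranges = [range(lo, hi + 1) for lo, hi in (block_interval(t_i, n) for t_i in t)]
        sites.update(product(*ranges))
    return sites
```

For instance, with $n = 3$ and $k = 2$ the blocks occupy $[1,3]$ and $[5,7]$ along each axis, which fits inside $[1, k(n+1)] = [1, 8]$ and leaves coordinate 4 free.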






The upper and lower bounds on $h(X)$ given by (9) and (10) differ by

$$\frac{2d\log|A|}{n} + \left(\frac{1}{n^d} - \frac{1}{(n+1)^d}\right)\log|L_{[1,n]^d}(X) \cap \langle G^{\Gamma}\rangle| \le \frac{2d\log|A|}{n} + \frac{(n+1)^d - n^d}{n^d (n+1)^d}\, n^d \log|A| \le \frac{2d\log|A|}{n} + \frac{d\log|A|}{n+1} \le \frac{3d\log|A|}{n}. \tag{11}$$

Since $\frac{1}{n^d}\log|L_{[1,n]^d}(X) \cap \langle G^{\Gamma}\rangle|$ is between the bounds from (9) and (10), it is within
$\frac{3d\log|A|}{n}$ of $h(X)$. By Corollary 2.31, $\log|L_{[1,n]^d}(X) \cap \langle G^{\Gamma}\rangle| =
\log|LA_{[1,n]^d}(X) \cap \langle G^{\Gamma}\rangle|$. Finally, we note that $\log|LA_{[1,n]^d}(X) \cap \langle G^{\Gamma}\rangle|$ can be computed
algorithmically, in $|A|^{n^d}$ steps, by simply writing down all possible
configurations with alphabet $A$ and shape $[1,n]^d$ and counting those which are
locally admissible and have restriction to $\Gamma$ consisting only of $G$-letters.

Since we may invest $|A|^{n^d}$ steps to get an approximation to $h(X)$ with
tolerance $\frac{3d\log|A|}{n}$, clearly $h(X)$ is computable in time $e^{O(n^d)}$, verifying (B).

□ 
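The counting procedure from this proof can be sketched in a few lines for $d = 2$. This is an illustration under stated assumptions rather than an optimized implementation: the names `count_g_boundary` and `entropy_bounds` are ours, the adjacency rules are passed as a hypothetical predicate `allowed(a, b, axis)`, and the returned numbers are the lower and upper estimates read off from (10) and (9), which are only guaranteed to bracket $h(X)$ under the hypotheses of the theorem.

```python
from itertools import product
from math import log


def count_g_boundary(n, alphabet, good, allowed):
    """Count locally admissible n x n configurations whose boundary carries only G-letters.

    allowed(a, b, axis) says whether b may sit one step after a along axis 0 or 1.
    The enumeration visits all |A|^(n^2) configurations, as in the proof.
    """
    sites = [(i, j) for i in range(n) for j in range(n)]
    boundary = {(i, j) for (i, j) in sites if i in (0, n - 1) or j in (0, n - 1)}
    count = 0
    for letters in product(alphabet, repeat=n * n):
        u = dict(zip(sites, letters))
        if any(u[s] not in good for s in boundary):
            continue
        ok = all(
            (i + 1 >= n or allowed(u[(i, j)], u[(i + 1, j)], 0))
            and (j + 1 >= n or allowed(u[(i, j)], u[(i, j + 1)], 1))
            for (i, j) in sites
        )
        if ok:
            count += 1
    return count


def entropy_bounds(n, alphabet, good, allowed):
    """Lower estimate from (10) and upper estimate from (9), specialized to d = 2."""
    d = 2
    c = count_g_boundary(n, alphabet, good, allowed)
    lower = log(c) / (n + 1) ** d
    upper = log(c) / n ** d + 2 * d * log(len(alphabet)) / n
    return lower, upper
```

As a toy example, take alphabet $\{0, 1, 2\}$ with $G = \{0, 1\}$ and the single rule that 2 may not be adjacent to 2. For $n = 3$ the eight boundary sites carry $G$-letters and the center is free, giving $2^8 \cdot 3 = 768$ configurations.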



Proof of (C). For any $n$, we define $X_n$ to be the $\mathbb{Z}^d$ SFT consisting of all points of
$X$ in which all connected components of $B$-letters have size less than $n$. We will
show that each $X_n$ has the UFP and that $h(X_n) \to h(X)$ as $n \to \infty$.
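Membership of a finite configuration in the language of $X_n$ depends on the sizes of its connected components of $B$-letters. A minimal sketch of that check for a finite two-dimensional configuration follows; the function name and the list-of-lists representation are our assumptions, and connectivity is taken with respect to nearest neighbors.

```python
from collections import deque


def max_bad_component_size(config, bad):
    """Size of the largest nearest-neighbor connected component of B-letters in a 2D grid."""
    rows, cols = len(config), len(config[0])
    seen = set()
    best = 0
    for i in range(rows):
        for j in range(cols):
            if config[i][j] not in bad or (i, j) in seen:
                continue
            # Breadth-first search over the component containing (i, j).
            size = 0
            queue = deque([(i, j)])
            seen.add((i, j))
            while queue:
                a, b = queue.popleft()
                size += 1
                for da, db in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    c, e = a + da, b + db
                    if (0 <= c < rows and 0 <= e < cols
                            and (c, e) not in seen and config[c][e] in bad):
                        seen.add((c, e))
                        queue.append((c, e))
            best = max(best, size)
    return best
```

A configuration is compatible with the defining restriction of $X_n$ exactly when this value is less than $n$.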

We first verify that $X_n$ has the UFP with distance $2n$. Consider any $k, l$ with $l >
k+3n$ and any configurations $w \in L_{[-k,k]^d}(X_n)$ and $w' \in L_{[-l,l]^d \setminus [-k-2n,k+2n]^d}(X_n)$.
We will exhibit $x \in X_n$ with $x|_{[-k,k]^d} = w$ and $x|_{[-l,l]^d \setminus [-k-2n,k+2n]^d} = w'$, proving
the UFP once one takes weak limits with $l \to \infty$.

We first use the fact that $w, w'$ are globally admissible in $X_n$ to extend them to
configurations $v \in L_{[-k-n+1,k+n-1]^d}(X_n)$ and $v' \in L_{[-l,l]^d \setminus [-k-n-1,k+n+1]^d}(X_n)$ respectively.
Then, in both $v$ and $v'$, remove any connected components of $B$-letters which have
empty intersection with $w$ or $w'$. Fill these with $G$-letters in some locally admissible
way by Lemma 2.30, creating configurations $u$ and $u'$ respectively. Since connected
components of $B$-letters in $X_n$ must have size less than $n$, $u|_{\partial[-k-n+1,k+n-1]^d}$ and
$u'|_{\partial[-k-n-1,k+n+1]^d}$ consist only of $G$-letters. (If this were not the case, then either $v$
contained a connected component of $B$-letters intersecting both $[-k,k]^d$ and
$\partial[-k-n+1,k+n-1]^d$ or $v'$ contained a connected component of $B$-letters intersecting
both $[-l,l]^d \setminus [-k-2n,k+2n]^d$ and $\partial[-k-n-1,k+n+1]^d$, and in either case such a
component would have had size at least $n$, which is impossible since $v, v' \in L(X_n)$.)
Then again by Corollary 2.30, the empty region $\partial[-k-n,k+n]^d$ between $u$ and
$u'$ can be filled with $G$-letters in a locally admissible way, creating a new locally
admissible configuration $v''$ with shape $[-l,l]^d$. Next, we note that since $w'$ was
globally admissible, there exists $x' \in X_n$ with $x'|_{[-l,l]^d \setminus [-k-2n,k+2n]^d} = w'$. Finally,
we note that $w'$ has "thickness" at least $n$, and that no letters on $w'$ were changed
in the construction of $v''$. Therefore, since $X_n$ is an SFT defined by forbidden
configurations of size at most $n$, the point $x \in A^{\mathbb{Z}^d}$ defined by $x|_{[-l,l]^d} = v''$ and
$x|_{\mathbb{Z}^d \setminus [-l,l]^d} = x'|_{\mathbb{Z}^d \setminus [-l,l]^d}$ is in $X_n$, and we have proved that $X_n$ has the UFP with
distance $2n$. (See Figure 1 for an illustration of the creation of $x$.)

We finish by verifying that $h(X_n) \to h(X)$. We showed in the proof of (B) that
for any collection $w_t \in L_{[1,n]^d}(X) \cap \langle G^{\partial[1,n]^d}\rangle$, $t \in [1,k]^d$, the concatenation $u$ of all
$w_t$, which has shape $\bigcup_t \prod_{i=1}^{d} [1 + (t_i - 1)(n+1),\ t_i(n+1) - 1]$, is in $L(X)$.

Figure 1. Filling between $w$ and $w'$ (shaded areas represent $B$-letters,
white areas represent $G$-letters)



To prove this fact, we invoked Corollary 2.31, which in fact says a bit more; it
implies that $u$ can be extended to a point $x \in X$ by appending only $G$-letters to $u$.
Note that since each $w_t$ contains only $n^d$ sites, and since all letters of $x$ outside $u$
are in $G$, $x$ does not contain any connected components of $B$-letters with size more
than $n^d$. Therefore, $x \in X_{n^d}$, which implies that $u \in L(X_{n^d})$. By counting the
possible choices for the collection $(w_t)$, we see that

$$|L_{[1,k(n+1)]^d}(X_{n^d})| \ge |L_{[1,n]^d}(X) \cap \langle G^{\partial[1,n]^d}\rangle|^{k^d}.$$

By taking logs of both sides, dividing by $(k(n+1))^d$, and letting $k \to \infty$, we see
that

$$h(X_{n^d}) \ge \frac{1}{(n+1)^d}\log|L_{[1,n]^d}(X) \cap \langle G^{\partial[1,n]^d}\rangle|.$$

We now recall that in the proof of (B), we showed that $\frac{1}{(n+1)^d}\log|L_{[1,n]^d}(X) \cap
\langle G^{\partial[1,n]^d}\rangle|$ is within $\frac{3d\log|A|}{n}$ of $h(X)$, and so we have shown that $h(X_{n^d}) \ge h(X) -
\frac{3d\log|A|}{n}$, implying that $h(X_n) \to h(X)$ as $n \to \infty$.

For any $\mathbb{Z}^d$ aperiodic ergodic measure-theoretic dynamical system $(Y, \mu, S_t)$ with
$h(\mu) < h(X)$, there then exists $n$ for which $h(\mu) < h(X_n)$. We recall from Section 2
that any $\mathbb{Z}^d$ SFT with the UFP is a measure-theoretic universal model, and so there
exists a measure $\nu$ on $X_n$ so that $(X_n, \nu, \sigma_t) \cong (Y, \mu, S_t)$. Since the support of $\nu$ is
contained in $X_n$, clearly it is contained in $X$ as well, and we have verified (C).

□ 

Proof of (D). We prove that $\mu$ is isomorphic to a Bernoulli measure by using the
property of quite weak Bernoulli as defined in [10].

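Definition 3.12. A measure $\mu$ on $A^{\mathbb{Z}^d}$ is called quite weak Bernoulli if for all
$\epsilon > 0$,

$$\lim_{n \to \infty} d\Big(\mu|_{[-n(1-\epsilon),n(1-\epsilon)]^d \,\cup\, \mathbb{Z}^d \setminus [-n,n]^d},\ \mu|_{[-n(1-\epsilon),n(1-\epsilon)]^d} \times \mu|_{\mathbb{Z}^d \setminus [-n,n]^d}\Big) = 0.$$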

It is known that quite weak Bernoulli measures are isomorphic to Bernoulli
measures (for instance, in [10], they note that it implies the property of very weak
Bernoulli as defined in [17], and that Theorem 1.1 of [17] shows that all very weak
Bernoulli measures are isomorphic to Bernoulli measures), and so it suffices to show
that $\mu$ is quite weak Bernoulli.

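It is shown in [10] that $\mu$ is quite weak Bernoulli if and only if for all $\epsilon > 0$,

$$\lim_{n \to \infty} \min\Big\{\alpha : \mu\Big(\Big\{\eta \in L_{\mathbb{Z}^d \setminus [-n,n]^d}(X) : d\big(\mu|_{[-n(1-\epsilon),n(1-\epsilon)]^d},\ \mu^{\eta}|_{[-n(1-\epsilon),n(1-\epsilon)]^d}\big) < \alpha\Big\}\Big) > 1 - \alpha\Big\} = 0, \tag{12}$$

where $\mu^{\eta}$ is the conditional distribution on $[-n,n]^d$ of $\mu$ given $\eta$. Since $\mu$ is an
MRF with uniform conditional probabilities $\Lambda^{\delta}$, we can replace $\mu^{\eta}$ by $\Lambda^{\delta}$, where
$\delta := \eta|_{\partial[-n,n]^d}$. Therefore, it suffices to show that for all $\epsilon > 0$,

$$\lim_{n \to \infty} \min\Big\{\alpha : \mu\Big(\Big\{\delta \in L_{\partial[-n,n]^d}(X) : d\big(\mu|_{[-n(1-\epsilon),n(1-\epsilon)]^d},\ \Lambda^{\delta}|_{[-n(1-\epsilon),n(1-\epsilon)]^d}\big) < \alpha\Big\}\Big) > 1 - \alpha\Big\} = 0. \tag{13}$$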



We first note that combining Lemma 3.3 with Corollary 3.9 yields the fact that
for any finite $T \subset \mathbb{Z}^d$, $\mu([B^T]) \le N^{-|T|}$, where $N = \lfloor \frac{1}{2}(\epsilon^{-1} - 4d - 4) \rfloor > 4d$ since $\epsilon <
\frac{1}{12d+4}$. By summing over all possible paths of $B$-letters from $\partial[-n(1-\epsilon),n(1-\epsilon)]^d$
to $\partial[-n,n]^d$, this implies that

M ([-n(l-e),n(l-e)]^5[-n,n] d )< Y,M L N~ L = jj^ljf) . 

Since $N > 4d$, $\mu\big([-n(1-\epsilon),n(1-\epsilon)]^d \leftrightarrow \partial[-n,n]^d\big) < 2^{-\epsilon n}$. Therefore, with $\mu$-probability
at least $1 - 2^{-\epsilon n}$ there exists a closed contour of $G$-letters containing
$[-n(1-\epsilon),n(1-\epsilon)]^d$ in its interior and contained within $[-n,n]^d$. Since $\mu$ is an
MRF with uniform conditional probabilities $\Lambda^{\delta}$, $\mu|_{[-n,n]^d} = \sum_{\delta} \mu([\delta])\Lambda^{\delta}$, where the
sum is over $\delta \in L_{\partial[-n,n]^d}(X)$. Clearly, there then exists a set $\Delta \subseteq L_{\partial[-n,n]^d}(X)$
with $\mu([\Delta]) > 1 - 2^{-0.5\epsilon n}$ so that for any $\delta \in \Delta$, the $\Lambda^{\delta}$-probability that there
exists a closed contour of $G$-letters containing $[-n(1-\epsilon),n(1-\epsilon)]^d$ and contained
in $[-n,n]^d$ is at least $1 - 2^{-0.5\epsilon n}$.



As in the proof of (A), for any $\delta \in \Delta$ and any $u \in L_{[-n,n]^d}(X)$ such that
$u\delta \in L(X)$, we define $\gamma(u)$ to be the unique maximal closed contour of $G$-letters
contained within $[-n,n]^d$, which contains $[-n(1-\epsilon),n(1-\epsilon)]^d$ with $\Lambda^{\delta}$-probability at
least $1 - 2^{-0.5\epsilon n}$ by definition of $\Delta$. Also define $B(u)$ to be the set of sites of $[-n,n]^d$
outside $\gamma(u)$, and $D(u) = u|_{B(u)}$. Then $[\{u \in L_{[-n,n]^d}(X) : u\delta \in L(X)\}]$ can be
written as a disjoint union of sets $[L_{[-n,n]^d}(X) \cap \langle D(u)\rangle]$, meaning that except for a
set of $\Lambda^{\delta}$-measure at most $2^{-0.5\epsilon n}$, $\Lambda^{\delta}|_{[-n(1-\epsilon),n(1-\epsilon)]^d}$ can be written as a weighted
average of $\mu\big([x|_{[-n(1-\epsilon),n(1-\epsilon)]^d}] \mid [D(u)]\big)$ over possible values of $D(u)$. Since $\mu$ is
an MRF, this is in fact a weighted average of $\Lambda^{\gamma(u)}|_{[-n(1-\epsilon),n(1-\epsilon)]^d}$. Finally, by
Fact 3.7, for any $\gamma(u) \neq \gamma(u')$ containing $[-n(1-\epsilon),n(1-\epsilon)]^d$,

d ^A 7(ll) | [ _„( 1 _ 2e ) ;n ( 1 _ 2£ )] £ i, A 7(u ) |[-„(i-26),n(i-2 £ )]< i ) < Z(2n) d 2~~ . 

So, for each $\delta \in \Delta$, except for a set of measure at most $2^{-0.5\epsilon n}$, $\Lambda^{\delta}|_{[-n(1-2\epsilon),n(1-2\epsilon)]^d}$
can be decomposed as a weighted average of $\Lambda^{\gamma(u)}|_{[-n(1-2\epsilon),n(1-2\epsilon)]^d}$, and each pair
$\Lambda^{\gamma(u)}|_{[-n(1-2\epsilon),n(1-2\epsilon)]^d}$, $\Lambda^{\gamma(u')}|_{[-n(1-2\epsilon),n(1-2\epsilon)]^d}$ has total variational distance less
than the bound in the preceding display, where $Z$ is a universal constant. Therefore, for any $\delta, \delta' \in \Delta$, the
pair $\Lambda^{\delta}|_{[-n(1-2\epsilon),n(1-2\epsilon)]^d}$, $\Lambda^{\delta'}|_{[-n(1-2\epsilon),n(1-2\epsilon)]^d}$ has total variational distance less
than $2^{-0.5\epsilon n}$ plus that same bound.

Recall that except for a set of $\mu$-measure at most $2^{-0.5\epsilon n}$, $\mu|_{[-n(1-2\epsilon),n(1-2\epsilon)]^d}$
can be decomposed as a weighted average over $\Lambda^{\delta}|_{[-n(1-2\epsilon),n(1-2\epsilon)]^d}$ for $\delta \in \Delta$. This
means that for any $\delta \in \Delta$,

" n(l— 2e),n(l— 2e 

-n(l-2£),n(l-2£)] d ) < 2 ■ 2 • " l + Z (2n) 2 7. 

As $n \to \infty$, the right-hand side of this inequality approaches 0, and $\mu(\Delta)$ approaches
1. We have then verified (13), so $\mu$ is quite weak Bernoulli, and therefore isomorphic
to a Bernoulli measure.

□ 

4. Unrelatedness of the $\epsilon$-fullness condition to mixing conditions

Often in $\mathbb{Z}^d$ symbolic dynamics, useful properties of an SFT follow from some
sort of uniform topological mixing condition, such as block gluing or uniform filling.
Examples of such properties are being a measure-theoretically universal model (follows
from UFP by [21]), the existence of dense periodic points (follows from block
gluing by the argument in [28]), and entropy minimality (nonexistence of proper
subshift of full entropy; follows from UFP by [53]). In this section, we will give
some examples to show that $\epsilon$-fullness and uniform topological mixing properties
are quite different notions for nearest neighbor $\mathbb{Z}^d$ SFTs.

Example 4.1 (Non-topologically mixing). Clearly $\epsilon$-fullness implies no mixing
conditions at all; the SFT from Example 2.33 can be made $\epsilon$-full for arbitrarily
small $\epsilon$ by increasing the parameter $n$, but is never topologically mixing since there
are no points which contain both a 0 and a 1.

Example 4.2 (Topologically mixing but not block gluing). In [23], a $\mathbb{Z}^2$ SFT
called the checkerboard island shift is defined. We briefly describe its properties
here, but refer the reader to [23] for more details. The checkerboard island shift
$C$ is defined by a set of legal $2 \times 2$ configurations, namely those appearing in
Figure 2, plus the $2 \times 2$ configuration of all blank symbols. Note that $C$ is not a
nearest neighbor SFT; we will deal with this momentarily. It is shown in [23] that
$C$ is topologically mixing, and in fact more is observed; any finite configuration
$w \in L(C)$ is a subconfiguration of a configuration $w' \in L(C)$ with only blank
symbols on the inner boundary. It is also shown in [23] that $C$ does not have the
uniform filling property. In fact, their proof also shows that $C$ is not block gluing;






they observe that any square checked configuration surrounded by arrows (e.g. the
central $8 \times 8$ block of Figure 2) forces a square configuration containing it of almost
twice the size (e.g. the $14 \times 14$ configuration in Figure 2), which clearly precludes
block gluing.




Figure 2. A sample configuration from $C$

First, we define a version of $C$ which is nearest neighbor by passing to the second
higher block presentation: define $C'$ to have alphabet $A'$ consisting of the $2 \times 2$
configurations from $L(C)$, and the only adjacency rule is that adjacent $2 \times 2$ blocks
must agree on the pair of letters along their common edge. (For instance, for letters
$a, b, c, d$ in the alphabet of $C$, the block with rows $ab$ and $cd$ followed to its right by the block with rows $ba$ and $dc$ would be a legal adjacent pair of letters in
$C'$.) $C'$ is topologically conjugate to $C$, and so shares all properties of $C$ described
above. The reader can check that the alphabet $A'$ of $C'$ has size 79.

We can now make versions of $C'$ which are $\epsilon$-full for small $\epsilon$; for any $N$, define
$C'_N$ to have alphabet $A'_N = G \cup A'$, where $G$ is a set of $N$ "free" symbols with the
following adjacency rules: each $G$-letter can appear next to any other $G$-letter, and
each $G$-letter can legally appear next to any letter from $A'$ consisting of four blank
symbols from the original alphabet of $C$. The reader may check that the addition of
these new symbols does not affect the above arguments proving topological mixing
and absence of block gluing, and so each $C'_N$ is topologically mixing, but not block
gluing. Also, since any $G$-letter can be legally followed in any direction by any
other $G$-letter, $C'_N$ is $\frac{79}{N+79}$-full. Clearly $C'_N$ can then be made $\epsilon$-full for arbitrarily
small $\epsilon$ by increasing the parameter $N$.
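The dependence of the fullness parameter on $N$ is simple enough to tabulate. The following sketch assumes that the parameter is the proportion $79/(N+79)$ of non-free letters in the alphabet of size $N + 79$; the function names are ours, and exact rational arithmetic is used to avoid rounding at the threshold.

```python
from fractions import Fraction
from math import floor


def fullness_parameter(num_free, num_other=79):
    """Proportion of letters outside G when G has num_free letters."""
    return Fraction(num_other, num_free + num_other)


def min_free_symbols(eps, num_other=79):
    """Smallest N with num_other / (N + num_other) strictly below eps."""
    eps = Fraction(eps)
    # num_other / (N + num_other) < eps  <=>  N > num_other * (1/eps - 1)
    return max(0, floor(num_other * (1 / eps - 1)) + 1)
```

For example, a target of $\epsilon = 1/10$ requires $N = 712$, since $N = 711$ gives exactly $79/790 = 1/10$.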

This example can be trivially extended to a nearest neighbor $\mathbb{Z}^d$ SFT $C'^{(d)}_N$ by
keeping the same alphabet and adjacency rules and adding no transition rules along
the extra $d-2$ dimensions. Clearly $C'^{(d)}_N$ is still topologically mixing and not block
gluing, and can be made $\epsilon$-full for arbitrarily small $\epsilon$ by taking $N$ large.

Example 4.3 (Block gluing but not uniform filling). In [25], a nearest neighbor
$\mathbb{Z}^2$ SFT called the wire shift is defined. We briefly describe its properties here,
but refer the reader to that paper for more details. For any integer $N$, the wire shift $W_N$
has alphabet $A_N = G \cup B$, where $B$ consists of six "grid symbols" illustrated in
Figure 3 and $G$ is a set of "blank tiles" labeled with integers from $[1, N]$. The
adjacency rules are that neighboring letters must have edges which "match up" in
the sense of Wang tiles; for instance, the leftmost symbol from Figure 3 could not
appear immediately above the second symbol from the left.

It is shown in Corollary 3.3 from [25] that $W_N$ is block gluing for any $N$. In
Lemma 3.4 from [25], it is shown that for any $N > 2$, $W_N$ is not entropy minimal,
in other words $W_N$ contains a proper subshift with topological entropy $h(W_N)$.
However, Lemma 2.7 from [25] shows that any $\mathbb{Z}^d$ SFT with the UFP is entropy
minimal, and so $W_N$ does not have the UFP for any $N > 2$.

Figure 3. The symbols $B$ from the alphabet of $W_N$

Again any letter of $G$ can be legally followed in any direction by any other letter
of $G$, implying that $W_N$ is $\frac{6}{N+6}$-full. Clearly then $W_N$ can be made $\epsilon$-full for
arbitrarily small $\epsilon$ by increasing the parameter $N$.

This example can also be trivially extended to a nearest neighbor $\mathbb{Z}^d$ SFT $W^{(d)}_N$
by keeping the same alphabet and adjacency rules and adding no transition rules
along the extra $d-2$ dimensions. Clearly $W^{(d)}_N$ is still block gluing and does not
have the UFP, and can be made $\epsilon$-full for arbitrarily small $\epsilon$ by taking $N$ large.

Example 4.4 (Uniform filling). Any full shift is $\epsilon$-full for any $\epsilon > 0$, and obviously
has the UFP.

Remark 4.5. We note that we can say a bit more about the extended checkerboard
island shift $C'_N$ of Example 4.2 for large $N$. As we already noted, it was shown in [23]
that every $w \in L(C)$ can be extended to a configuration $w' \in L(C)$ with only blank
symbols from the alphabet of $C$ on the inner boundary. Then, for any $N$, we claim
that any configuration $w \in L(C'_N)$ can be extended to a configuration $w' \in L(C'_N)$
with only $G$-letters on the inner boundary: any configuration $w \in L(C'_N)$ looks like
a recoded version of a configuration from $C$, possibly with some $G$-letters replacing
letters of $A'$ consisting of four blanks from the original alphabet of $C$, and so the
same extension proved in [23] guarantees that $w$ can be extended to a configuration with only
$G$-letters and four-blank letters from $A'$ on its inner boundary, which can itself be surrounded by a boundary
of $G$-letters. If we take any $N$ for which $\frac{79}{N+79} < \epsilon_2$ and denote the unique measure
of maximal entropy on $C'_N$ by $\mu$, then by Corollary 3.10, $\mu([w']) > 0$, clearly
implying that $\mu([w]) > 0$. We have then shown that $\mu$ has full support, and in
particular have shown the following:

Theorem 4.6. For any $\epsilon > 0$, there exists an $\epsilon$-full nearest neighbor $\mathbb{Z}^2$ SFT with
unique measure of maximal entropy $\mu$ whose support is not block gluing.

In other words, there really is no uniform mixing condition implied by the
$\epsilon$-fullness property, even hidden within the support of the unique measure of maximal
entropy.

5. Comparison with existing sufficient conditions for uniqueness of 
measure of maximal entropy 

For now, we focus on property (A) from Theorem 3.1, i.e. the fact that for any
$d$ and small enough $\epsilon$, any $\epsilon$-full nearest neighbor $\mathbb{Z}^d$ SFT has a unique measure of
maximal entropy. In this section, we will attempt to give proper context by giving
some examples of conditions in the literature related to our condition. We first need
a definition.
Definition 5.1. For a nearest neighbor $\mathbb{Z}^d$ SFT $X$, a letter $a$ of the alphabet $A$ is a
safe symbol if $a$ is a legal neighbor of every letter of $A$ in every cardinal direction
$\pm e_i$.
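Definition 5.1 is easy to check mechanically for a given set of adjacency rules. The sketch below assumes the rules are supplied as a hypothetical predicate `allowed(a, b, axis)`, meaning that `b` may appear one step after `a` in the positive direction along `axis`; the name `safe_symbols` is ours.

```python
def safe_symbols(alphabet, allowed, d):
    """Letters that are legal neighbors of every letter in each direction +e_i and -e_i."""
    return {
        a
        for a in alphabet
        if all(
            allowed(a, b, axis) and allowed(b, a, axis)
            for b in alphabet
            for axis in range(d)
        )
    }
```

For the hard-core shift (alphabet $\{0, 1\}$, no two adjacent 1s) this returns $\{0\}$, while for a checkerboard shift, where no letter may neighbor itself, it returns the empty set.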

Example 5.2. In [21], the classical Dobrushin uniqueness criterion for Markov 
random fields was used to prove the following result: 

Theorem 5.3 ([21], Proposition 5.1). For any nearest neighbor $\mathbb{Z}^d$ SFT $X$ with
alphabet $A$ such that at least $|A|\left(\frac{2d}{\sqrt{4d^2+1}+1}\right)$ of the letters of $A$ are safe symbols,
$X$ has a unique measure of maximal entropy.

This seems to be the first result to show that a large proportion of safe symbols 
is enough to guarantee uniqueness. 

Example 5.4. In [9], methods from percolation theory, following techniques from
PQ, were used to prove the following result. 

Theorem 5.5 ([3], Theorem 1.17). For any nearest neighbor $\mathbb{Z}^d$ SFT $X$ with alphabet
$A$ such that at least $|A|\sqrt{1 - p_c(\mathbb{Z}^d)}$ of the letters of $A$ are safe symbols, $X$
has a unique measure of maximal entropy, where $p_c(\mathbb{Z}^d)$ is the critical probability
for site percolation on the $d$-dimensional hypercubic lattice.

(This is not the precise statement of their theorem, but is an equivalent reformulation
which better contrasts with Theorem 5.3.) We will not define $p_c(\mathbb{Z}^d)$ or
discuss percolation theory here; for a good introduction to the subject, see [13].

Theorem 5.5 is stronger than Theorem 5.3 for $d = 2$: it is known that $p_c(\mathbb{Z}^2) >
0.5$, so $\sqrt{1 - p_c(\mathbb{Z}^2)} < \sqrt{0.5} < \frac{4}{\sqrt{17}+1}$. However, for large $d$, $p_c(\mathbb{Z}^d) =$

(PS), therefore ^J d +1+1 = 1 - ^ < 1 - ^ = y/l-p c (Z«), implying that 
Theorem 5.3 is stronger for large $d$.

Example 5.6. A slightly more general result comes from [14], which requires an- 
other definition. 

Definition 5.7. For any $\mathbb{Z}^d$ subshift $X$ with alphabet $A$, the generosity of $X$ is

$$G(X) = \frac{1}{|A|} \min_{\delta} |\{a \in A : a\delta \in L(X)\}|,$$

where the minimum ranges over $\delta \in A^{\mathbb{Z}^d \setminus \{0\}}$ s.t. $\delta a \in X$ for at least one $a \in A$.

Theorem 5.8 ([2], Theorem 1.12). Any nearest neighbor $\mathbb{Z}^d$ SFT with generosity
at least $\frac{1}{1 + p_c(\mathbb{Z}^d)}$ has a unique measure of maximal entropy.

The strength of Theorem 5.8 is that it allows one to consider SFTs without
safe symbols. For instance, the $n$-checkerboard $\mathbb{Z}^d$ SFT, defined by alphabet
$A = \{1, \ldots, n\}$ and the adjacency rule that no letter may be adjacent to itself
in any cardinal direction, has generosity $1 - \frac{2d}{n}$, which satisfies the hypotheses of
Theorem 5.8 for large $n$. However, it has no safe symbols, and so cannot satisfy
the hypotheses of Theorems 5.3 and 5.5.
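The generosity of the $n$-checkerboard can be checked by brute force for small parameters. This is a sketch under two assumptions that we state explicitly: only the $2d$ nearest neighbors of the origin restrict the letter at the origin, and every assignment of colors to those neighbors extends to a legal configuration, which we only rely on for $n > 2d$. The function name is ours.

```python
from fractions import Fraction
from itertools import product


def checkerboard_generosity(n, d):
    """Minimum proportion of colors available at the origin over all neighbor colorings."""
    if n <= 2 * d:
        raise ValueError("this sketch only treats the case n > 2d")
    fewest = min(
        n - len(set(neighbors))  # colors not used by any of the 2d neighbors
        for neighbors in product(range(n), repeat=2 * d)
    )
    return Fraction(fewest, n)
```

The minimum is attained when the $2d$ neighbors carry distinct colors, which leaves $n - 2d$ choices at the origin.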

Remark 5.9. We note that Häggström also showed in [14] that it is not possible
for any $d \ge 2$ to give a lower bound on $G(X)$ in Theorem 5.8 which would imply
uniqueness of measure of maximal entropy for all $\mathbb{Z}^d$ SFTs (not just nearest neighbor):
Theorem 1.13 from [14] states that for any $d \ge 2$ and any $\epsilon > 0$, there exists a
$\mathbb{Z}^d$ SFT $X$ with more than one measure of maximal entropy such that $G(X) > 1 - \epsilon$.
This implies that such a uniform lower bound would be impossible for part (A) of
our Theorem 3.1 as well, since any nearest neighbor $\mathbb{Z}^d$ SFT with generosity more
than $1 - \epsilon$ is clearly $\epsilon$-full.

One property shared by all of these conditions is that they require all
letters of $A$ to satisfy some fairly stringent adjacency properties. For instance, if $A$
contains even one letter which has only a single allowed neighbor in some direction,
then it has at most one safe symbol and its generosity equals $\frac{1}{|A|}$ (the minimum
possible amount). The strength of the $\epsilon$-fullness condition is that it allows the
existence of a small set of letters with bad adjacency properties, as long as the rest
of the symbols are "close enough" to being safe symbols.

6. Optimal value of $\beta_d$

It is natural to define the optimal value of $\beta_d$ in Theorem 3.2 which guarantees
uniqueness of the measure of maximal entropy.

Definition 6.1. For every $d$, we define

$\alpha_d := \inf\{\alpha : \exists$ a nearest neighbor $\mathbb{Z}^d$ SFT $X$
with more than one measure of maximal entropy for which $h(X) > (\log |A|) - \alpha\}$.

We can determine $\alpha_1$ exactly.

Theorem 6.2. $\alpha_1 = \log 2$.

Proof. Suppose that $X$ is a nearest neighbor $\mathbb{Z}$ SFT with alphabet $A$ and more than
one measure of maximal entropy. Since any measure of maximal entropy of $X$ is
supported in an irreducible component of $X$, clearly $X$ has at least two irreducible
components of entropy $h(X)$. (See [20] for the definition of irreducible components
and a simple proof of this fact.) Each of these components must then have alphabet
size at least $e^{h(X)}$, and so $|A| \ge 2e^{h(X)}$. This implies that $(\log |A|) - h(X) \ge \log 2$,
and so $\alpha_1 \ge \log 2$.

However, the same idea shows that there exists a nearest neighbor $\mathbb{Z}$ SFT $X$
with multiple measures of maximal entropy and for which $(\log |A|) - h(X) = \log 2$;
if $X$ is the union of two disjoint full shifts on $n$ symbols, then each of the full shifts
supports a measure of maximal entropy for $X$. Therefore, $\alpha_1 \le \log 2$, completing
the proof.

□ 
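The extremal example in this proof can be checked numerically. The sketch below estimates the entropy of a one-dimensional nearest neighbor SFT from the ratio of consecutive word counts; the function names are ours, and the ratio estimate is an assumption that is exact for the union of two disjoint full shifts but may oscillate for general (for instance periodic) adjacency matrices.

```python
from math import log


def word_counts(adj, length):
    """Number of legal words of the given length for a 0/1 adjacency matrix."""
    size = len(adj)
    ending_at = [1] * size  # words of length 1 ending in each letter
    for _ in range(length - 1):
        ending_at = [
            sum(ending_at[a] for a in range(size) if adj[a][b]) for b in range(size)
        ]
    return sum(ending_at)


def entropy_estimate(adj, length=40):
    """log of the growth ratio of word counts, an estimate of the topological entropy."""
    return log(word_counts(adj, length + 1)) - log(word_counts(adj, length))


def two_disjoint_full_shifts(n):
    """Adjacency matrix on 2n letters: letters may be adjacent iff they lie in the same half."""
    return [[1 if (a < n) == (b < n) else 0 for b in range(2 * n)] for a in range(2 * n)]
```

For the union of two full shifts on 3 symbols, there are $2 \cdot 3^k$ words of length $k$, so the estimate is $\log 3$ and the gap $\log|A| - h(X)$ is $\log 6 - \log 3 = \log 2$.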

We will now show that $\alpha_2 < \log 2$, meaning that in two dimensions, it is possible
to have an SFT $X$ in which two disjoint portions of the alphabet induce different
measures of maximal entropy and can still coexist within the same point of $X$
(unlike the one-dimensional case).

Theorem 6.3. $\alpha_2 < \log 2$.

Proof. In [5], an example is given of a strongly irreducible nearest neighbor $\mathbb{Z}^2$ SFT
with two ergodic measures of maximal entropy, later called the iceberg model in
the literature. We quickly recall the definition of this SFT.

The iceberg model, which we denote by $I_M$, is defined by a positive integer parameter
$M$. The alphabet is $A_M = \{-M, \ldots, -2, -1, 1, 2, \ldots, M\}$. The adjacency
rules of $I_M$ are that any letters with the same sign may neighbor each other, but
a positive may only sit next to a negative if they are $1$ and $-1$. It is shown in [8]
that for any $M > 4e \cdot 28^2$, $I_M$ has exactly two ergodic measures of maximal entropy.

We will now show that $h(I_M)$ is strictly greater than $\log M$, which will imply
that $(\log |A_M|) - h(I_M) < \log 2$. For any $n$, define the set $P_n$ of all configurations
with shape $[1, n]^2$ consisting of letters from $\{1, \ldots, M\}$. Clearly $P_n \subseteq L_{[1,n]^2}(I_M)$.
By ergodicity of the Bernoulli measure giving each positive letter equal probability 
(or simply the Strong Law of Large Numbers), if we define G n to be the set of 

2 1 

configurations in P n with at least occurrences of the configuration l l l , then 
lim JWOO l ° g 7 [f" ■ = logM. In any u <G G n , it is simple to choose at least 
occurrences of l ! l with disjoint centers. Then, one can construct a set f{u) of 

at least $2^{n^2/(5M^5)}$ configurations in $L_{[1,n]^2}(I_M)$ by independently either replacing the
center of each of these 1-crosses by $-1$ or leaving it as a $1$. It is easy to see that
$f(u) \cap f(u') = \emptyset$ for any $u \neq u' \in G_n$. Therefore, $|L_{[1,n]^2}(I_M)| \ge \left|\bigcup_{u \in G_n} f(u)\right| \ge
2^{n^2/(5M^5)} |G_n|$, implying that

$$h(I_M) = \lim_{n \to \infty} \frac{\log |L_{[1,n]^2}(I_M)|}{n^2} \ge \lim_{n \to \infty} \frac{\log |G_n|}{n^2} + \frac{\log 2}{5M^5} = \log M + \frac{\log 2}{5M^5}.$$

Therefore, $(\log |A_M|) - h(I_M) \le \log 2 - \frac{\log 2}{5M^5}$, and so since $I_M$ has two measures of
maximal entropy for $M = 10000 > 4e \cdot 28^2$, $\alpha_2 \le \log 2 - \frac{\log 2}{5 \cdot 10000^5} < \log 2$.

□ 
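The ingredients of this proof are elementary enough to check directly. The sketch below encodes the adjacency rule of the iceberg model and evaluates the numerical quantities involved; the function names are ours, and the entropy gain $\log 2 / (5M^5)$ is taken as read from the proof above rather than derived independently.

```python
from math import e, log


def iceberg_legal_pair(a, b):
    """Adjacency rule of the iceberg model: same sign, or the pair {1, -1}."""
    return (a > 0) == (b > 0) or {a, b} == {1, -1}


def flipped_center_is_legal():
    """A center replaced by -1 is legal next to the four 1s of its cross."""
    return all(iceberg_legal_pair(-1, 1) for _ in range(4))


def entropy_gain(M):
    """The term log(2) / (5 M^5) by which h(I_M) exceeds log M in the proof."""
    return log(2) / (5 * M ** 5)
```

With $M = 10000$, the threshold $4e \cdot 28^2 \approx 8524.6$ is indeed exceeded, and the gain is positive although far too small to be visible in floating-point subtraction from $\log 2$.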

Unsurprisingly, the sequence $\alpha_d$ is nonincreasing in $d$.
Theorem 6.4. $\alpha_{d+1} \le \alpha_d$ for all $d$.

Proof. Fix any $d$ and any $\epsilon > 0$. By definition, there exists a nearest neighbor $\mathbb{Z}^d$
SFT $X$ (with alphabet $A$) with $\mu_1 \neq \mu_2$ measures of maximal entropy for which
$h(X) > (\log |A|) - \alpha_d - \epsilon$. We may then define $X^{\mathbb{Z}}$ to be the nearest neighbor $\mathbb{Z}^{d+1}$
SFT containing all $x \in A^{\mathbb{Z}^{d+1}}$ for which each $x|_{\mathbb{Z}^d \times \{n\}} \in X$. In other words, $X^{\mathbb{Z}}$
has the adjacency rules for $X$ in the $e_1, \ldots, e_d$-directions, and no restrictions at all
in the $e_{d+1}$-direction.

Then clearly $h(X^{\mathbb{Z}}) = h(X) > (\log |A|) - \alpha_d - \epsilon$. Also, it is not hard to check that
$\mu_1^{\mathbb{Z}}$ and $\mu_2^{\mathbb{Z}}$ are measures of maximal entropy for $X^{\mathbb{Z}}$, where $\mu_i^{\mathbb{Z}}$ is the independent
product of countably many copies of $\mu_i$ in the $e_{d+1}$-direction. Therefore, $\alpha_{d+1} \le
\alpha_d + \epsilon$. Since $\epsilon$ was arbitrary, we are done.

□ 

Theorem 3.2 implies that $\alpha_d$ decays at most polynomially; $\alpha_d \ge \beta_d$, and $\beta_d$ decays like $d^{-17}$.
Our final result gives an upper bound on $\alpha_d$ which also decays polynomially.

Theorem 6.5. There exists a constant $B$ so that $\alpha_d \le \lfloor B d^{0.25} (\log d)^{-0.75} \rfloor^{-1}$ for all $d$.

Proof. The main tool in our construction is a theorem of Galvin and Kahn ([12])
about phase transitions for the hard-core shift with activities. Specifically, define a
Gibbs measure with activity $\lambda \in \mathbb{R}^+$ on the $\mathbb{Z}^d$ hard-core shift $\mathcal{H}_d$ to be a measure
$\mu$ with the property that for any $n$ and any configurations $w$ with shape $S$ and $\delta$
with shape $\partial S$ such that $w\delta \in L(\mathcal{H}_d)$,

$$\mu([w] \mid [\delta]) = \frac{\lambda^{\#\text{ ones in } w}}{\sum_{v \in \{0,1\}^S \text{ s.t. } v\delta \in L(\mathcal{H}_d)} \lambda^{\#\text{ ones in } v}}.$$

In other words, the conditional probability of $w$ given a fixed $\delta$ is proportional to
$\lambda^{\#\text{ ones in } w}$. One way to create Gibbs measures is to fix any boundary conditions
$\delta_n \in L_{\partial[-n,n]^d}(\mathcal{H}_d)$ on larger and larger cubes $[-n,n]^d$, and take a weak limit
point of the sequence of conditional measures $\mu([x|_{[-n,n]^d}] \mid [\delta_n])$. For the hard-core
model, two boundary conditions of interest are $\delta_{e,n}$, which contains 1 on all odd
$t \in \partial[-n,n]^d$ (i.e. $\sum_i t_i$ odd) and 0 on all even $t \in \partial[-n,n]^d$ (i.e. $\sum_i t_i$ even), and $\delta_{o,n}$,
which does the reverse.
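For small regions, the conditional distribution in this definition can be computed by direct enumeration. The following is a minimal sketch, not part of the argument: the function name is ours, sites are tuples of integers, the boundary is a dictionary from sites to 0 or 1, and any site outside both the region and the boundary is treated as 0 (an assumption made only to keep the example short). It also assumes that at least one legal filling exists.

```python
from fractions import Fraction
from itertools import product


def hardcore_conditional(sites, boundary, lam):
    """Distribution on {0,1}^sites given the boundary, with weight lam^(number of ones)."""
    sites = list(sites)
    lam = Fraction(lam)
    weights = {}
    for values in product((0, 1), repeat=len(sites)):
        x = dict(boundary)
        x.update(zip(sites, values))
        legal = True
        for s in sites:
            if x[s] != 1:
                continue
            # A 1 inside the region may not have a nearest neighbor equal to 1.
            for axis in range(len(s)):
                for step in (-1, 1):
                    t = s[:axis] + (s[axis] + step,) + s[axis + 1:]
                    if x.get(t, 0) == 1:
                        legal = False
        if legal:
            weights[values] = lam ** sum(values)
    total = sum(weights.values())
    return {v: w / total for v, w in weights.items()}
```

For a single site with all four neighbors equal to 0 and activity $\lambda = 1/2$, the site is 1 with probability $\lambda/(1+\lambda) = 1/3$; if any neighbor is 1, the site is forced to be 0.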

The main result of [12] states that there exists a universal constant $C$ with the
following property: for any dimension $d$ and activity level $\lambda$ which is greater than
$C d^{-0.25} (\log d)^{0.75}$, the sequences of conditional measures $\mu([x|_{[-n,n]^d}] \mid [\delta_{e,n}])$ and
$\mu([x|_{[-n,n]^d}] \mid [\delta_{o,n}])$ approach respective weak limits $\mu_e$ and $\mu_o$, which are distinct
Gibbs measures with activity $\lambda$ on $\mathcal{H}_d$. Though it is not explicitly stated in [12], it
is well-known that each of $\mu_o$ and $\mu_e$ is a shift by one unit in any cardinal direction
of the other, and in particular that both $\mu_o$ and $\mu_e$ are invariant under any shift in
$(2\mathbb{Z})^d$. (This is mentioned in, among other places, [3].) The strategy used in [12] to
show that $\mu_o \neq \mu_e$ is quite explicit: it is shown that for $\lambda$ satisfying the hypothesis
of the theorem, $\mu_e(x(0) = 1) < \mu_o(x(0) = 1)$.

We now define $\mathcal{H}_{N,d}$ to be the nearest-neighbor $\mathbb{Z}^d$ SFT with $N$ safe symbols
$\{0_1, \ldots, 0_N\}$ and a symbol 1 which cannot appear next to itself in any cardinal
direction; $\mathcal{H}_{N,d}$ is a version of the usual hard-core shift where the symbol 0 has
been "split" into $N$ copies. For any Gibbs measure $\mu$ on the hard-core model $\mathcal{H}_d$
with activity $\lambda = \frac{1}{N}$, define a measure $\widehat{\mu}$ on $\mathcal{H}_{N,d}$ by "splitting" the measure of
any cylinder set $[w]$ uniformly over all $N^{\#\text{ zeroes in } w}$ ways of assigning subscripts
to 0 symbols in $w$. It is easily checked that any such measure $\widehat{\mu}$ has the uniform
conditional probabilities property from the conclusion of Proposition 2.16 (Proposition
1.20 from [9]), which stated that all measures of maximal entropy have this
property. In fact, Proposition 1.21 from [9] gives a partial converse: for strongly
irreducible SFTs, any shift-invariant measure with uniform conditional probabilities
must be a measure of maximal entropy.
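The reason activity $1/N$ corresponds to uniform conditional probabilities after splitting is a counting identity: a hard-core word with $z$ zeroes has $N^z$ lifts, and $N^z$ is proportional to $(1/N)^{\#\text{ones}}$ among words of a fixed length. The sketch below checks this on one-dimensional words with free boundary, which is our simplification for illustration; the function names are ours.

```python
from fractions import Fraction
from itertools import product


def legal_hardcore_words(length):
    """Words over {0, 1} of the given length with no two adjacent 1s."""
    return [
        w for w in product((0, 1), repeat=length)
        if all(not (a and b) for a, b in zip(w, w[1:]))
    ]


def split_distribution(length, N):
    """Uniform measure on words of the split shift, projected by erasing subscripts."""
    lifts = {w: N ** w.count(0) for w in legal_hardcore_words(length)}
    total = sum(lifts.values())
    return {w: Fraction(c, total) for w, c in lifts.items()}


def gibbs_distribution(length, lam):
    """Measure on hard-core words with weight lam^(number of ones)."""
    lam = Fraction(lam)
    weights = {w: lam ** sum(w) for w in legal_hardcore_words(length)}
    total = sum(weights.values())
    return {w: wt / total for w, wt in weights.items()}
```

The two distributions agree exactly when `lam` equals `1/N`, for example `split_distribution(4, 3) == gibbs_distribution(4, Fraction(1, 3))`.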

Since $\mathcal{H}_{N,d}$ is clearly strongly irreducible, this would show that $\widehat{\mu_o}$ and $\widehat{\mu_e}$ are
measures of maximal entropy on $\mathcal{H}_{N,d}$, were it not for the fact that these measures
are not shift-invariant. However, their average $\frac{1}{2}(\widehat{\mu_o} + \widehat{\mu_e})$ clearly shares the uniform
conditional probability property, and is shift-invariant, and so is a measure of
maximal entropy on $\mathcal{H}_{N,d}$. In fact, it is the unique measure of maximal entropy on
$\mathcal{H}_{N,d}$. We will show, however, that the direct product of $\mathcal{H}_{N,d}$ with itself can have
multiple measures of maximal entropy.

Define the nearest neighbor $\mathbb{Z}^d$ SFT $\mathcal{H}^2_{N,d}$ with alphabet $\{(a, b) : a, b \in
\{0_1, \ldots, 0_N, 1\}\}$, where the adjacency rules from $\mathcal{H}_{N,d}$ are separately enforced in
each coordinate. In other words, $(0_1, 1)$ may appear next to $(1, 0_2)$ since $0_1 1$ and
$1 0_2$ are each legal in $\mathcal{H}_{N,d}$, but $(0_1, 1)$ cannot appear next to $(1, 1)$: though $0_1 1$
is legal in $\mathcal{H}_{N,d}$, $11$ is not. Define the measures $\nu_1 := \frac{1}{2}(\widehat{\mu_o} \times \widehat{\mu_o} + \widehat{\mu_e} \times \widehat{\mu_e})$ and
$\nu_2 := \frac{1}{2}(\widehat{\mu_o} \times \widehat{\mu_e} + \widehat{\mu_e} \times \widehat{\mu_o})$ on $\mathcal{H}^2_{N,d}$. It should be obvious that both $\nu_1$ and $\nu_2$
have the uniform conditional probability property mentioned above and that both
are shift-invariant. Since $\mathcal{H}^2_{N,d}$ is strongly irreducible (it still has safe symbols, for
instance $(0_1, 0_1)$), both $\nu_1$ and $\nu_2$ are therefore measures of maximal entropy on
$\mathcal{H}^2_{N,d}$.

We claim that $\nu_1 \neq \nu_2$ as long as $N < C^{-1} d^{0.25} (\log d)^{-0.75}$. If this condition
holds, then $\frac{1}{N} > C d^{-0.25} (\log d)^{0.75}$, and so by [12], $\mu_e(x(0) = 1) < \mu_o(x(0) = 1)$ for
the hard-core shift with activity $\lambda = \frac{1}{N}$. Clearly, $\widehat{\mu_e}(x(0) = 1) = \mu_e(x(0) = 1)$ and
$\widehat{\mu_o}(x(0) = 1) = \mu_o(x(0) = 1)$, so $\widehat{\mu_e}(x(0) = 1) < \widehat{\mu_o}(x(0) = 1)$. For brevity, denote
these probabilities by $\alpha$ and $\beta$ respectively. By definition, $\nu_1(x(0) = (1,1)) =
\frac{1}{2}(\alpha^2 + \beta^2)$ and $\nu_2(x(0) = (1,1)) = \frac{1}{2}(\alpha\beta + \alpha\beta)$. These are equal if and only if
$\alpha = \beta$, which is not the case. Therefore, $\nu_1 \neq \nu_2$ and $\mathcal{H}^2_{N,d}$ has multiple measures
of maximal entropy.
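The final comparison rests on the identity $\frac{1}{2}(\alpha^2 + \beta^2) - \alpha\beta = \frac{1}{2}(\alpha - \beta)^2$, which is strictly positive whenever $\alpha \neq \beta$. A small exact check follows; the function name is ours, and the sample values of $\alpha$ and $\beta$ are arbitrary rather than actual hard-core probabilities.

```python
from fractions import Fraction


def origin_probabilities(alpha, beta):
    """Probabilities that x(0) = (1, 1) under the two mixtures of product measures."""
    alpha, beta = Fraction(alpha), Fraction(beta)
    nu1 = (alpha ** 2 + beta ** 2) / 2
    nu2 = (alpha * beta + alpha * beta) / 2
    return nu1, nu2
```

For example, $\alpha = 1/5$ and $\beta = 2/5$ give a difference of $\frac{1}{2} \cdot \frac{1}{25} = \frac{1}{50}$.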

Our final observation is that any way of placing arbitrary letters on even sites and
only pairs $(0_i, 0_j)$ on odd sites yields configurations in $L(\mathcal{H}^2_{N,d})$, and so $h(\mathcal{H}^2_{N,d}) \ge
\frac{1}{2}(\log (N+1)^2 + \log N^2) = \log N(N+1)$. Therefore, for this SFT, $\log |A| - h(\mathcal{H}^2_{N,d}) \le
\log (N+1)^2 - \log N(N+1) = \log \left(1 + \frac{1}{N}\right) < \frac{1}{N}$. Choosing $B = C^{-1}$ and $N =
\lfloor C^{-1} d^{0.25} (\log d)^{-0.75} \rfloor$ now completes our proof.

□ 
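The arithmetic at the end of this proof is easy to confirm numerically. The sketch below evaluates the alphabet term $\log (N+1)^2$ and the entropy lower bound $\frac{1}{2}(\log (N+1)^2 + \log N^2)$ and returns their difference; the function name is ours.

```python
from math import log


def product_hardcore_gap(N):
    """Upper bound on log|A| - h for the product hard-core shift with N split zeroes."""
    log_alphabet = log((N + 1) ** 2)
    entropy_lower = 0.5 * (log((N + 1) ** 2) + log(N ** 2))
    return log_alphabet - entropy_lower
```

The returned value agrees with $\log(1 + 1/N)$ and stays below $1/N$, as claimed.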

Just the fact that $\alpha_d \to 0$ provides some quantification of the commonly believed
heuristic that it should get easier, not harder, to have multiple measures of maximal
entropy as $d \to \infty$, as there are more paths in which information can communicate.

Our bounds on $\alpha_d$ are then $O(d^{-17}) \le \alpha_d \le d^{-0.25+o(1)}$. We imagine that
the true values of $\alpha_d$ are closer to the upper bounds than the lower, but have no
conjectures as to the exact rate.